Sunday, December 30, 2012

Coursera and the analytics talent gap

It's been a while, and ThinkOR is back to blogging about Operational Research and its related themes.

ThinkOR authors are about to start on 3 Coursera courses over the next couple of months:

I am not only learning about some new topics for my own benefit, but am also interested in assessing how such easily accessible courses could help close the so-called 'big data and analytics talent gap' in businesses. As a Business Analytics consultant, I see this as one of the biggest issues my clients face in today's business world: they won't think about analytics talent if they don't know about it, and once they do know about it, they don't know how to get more of it. Obviously, there would need to be some sort of step progression, such as (just an example, without much research at this point):
  1. Statistics One
  2. Data Analysis (with R) and/or Computing for Data Analysis
  3. some sort of programming course, check the computing course catalogue
  4. Focus on one or several of the main OR techniques and their associated tools, such as Discrete Event Simulation, Monte Carlo Simulation, Optimisation, Forecasting, Machine Learning, and the good old Volumetric Modelling, as some examples
  5. and if you are going to work with humongous data sets, Intro to Data Science sounds like a reasonable way to become familiar with the various big data technologies used to apply data science (I suspect this often eludes traditional OR practitioners)
As ThinkOR goes along, we will be blogging about these courses and our learning experience. So far, there has only been very positive feedback. Let's get going!

Merry Christmas and Happy New Year!

Thursday, May 31, 2012

Consistent Education Divide in Cities

The Daily Viz brought this to my attention. It's a visual by the New York Times showing how the distribution of cities by proportion of adults with college degrees has changed over the last 40 years.

Nicely formatted and presented, though my ability to compare the distributions side-by-side is a little bit limited.

The key story that this visual is telling is that the average has moved from 12% to 32%, but that the number of cities more than 5% above or below the average has increased substantially. "College graduates are more unevenly distributed in the top 100 metropolitan areas now than they were four decades ago." But I'm not sure if it's as simple as that.

Suppose I was measuring trees. Species one was 10 feet tall on average and species two was 100 feet tall. If the first tended to vary between 7 feet and 13 feet, but the latter tended to vary from 85 feet to 115 feet, I wouldn't remark on how much more variable the taller trees were. For species one, no tree was more than 3 feet from the average, but in species two, presumably many are. Is this a sign that species two is more unevenly distributed? Not really. Species one varies up and down by 30% where two does so by 15%.

So I asked myself, given that the average proportion of adults with college degrees has nearly tripled to 32%, has their variability increased proportionally? Now that these trees are 32 feet tall, it seems strange to still measure their "unevenness" by how many of them are between 27 and 37.

So I reached for a statistic, the Coefficient of Variation (the standard deviation divided by the mean). Using my eyes to collect the data from the charts (so not precisely the correct data), I calculate a coefficient of 0.25 in 1970 and 0.22 in 2010. The variation in the data as a proportion of the average has gone down in the last four decades.
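To make the statistic concrete, here is a minimal sketch of the calculation using the tree example rather than the NYT data, which I only read off the charts by eye. The heights below are hypothetical values chosen to span the ranges mentioned earlier (7 to 13 feet and 85 to 115 feet), and the function name is my own.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation expressed as a fraction of the mean."""
    return pstdev(values) / mean(values)


# Hypothetical tree heights in feet (illustrative only, not measured data).
species_one = [7, 9, 10, 11, 13]
species_two = [85, 95, 100, 105, 115]

cv_one = coefficient_of_variation(species_one)
cv_two = coefficient_of_variation(species_two)

# The taller species has the larger absolute spread but the smaller relative one.
print(f"species one: {cv_one:.2f}, species two: {cv_two:.2f}")
```

With these made-up heights the shorter species comes out as relatively more variable, which is the same pattern I am arguing for in the college degree data.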

Again, the NYT concludes that "College graduates are more unevenly distributed in the top 100 metropolitan areas now than they were four decades ago", but I would argue that if anything they are slightly more evenly spread than before, though not remarkably so.

Saturday, April 28, 2012

Data Journalism



I've recently started following the Guardian's Data Blog, but I was a little disappointed with their recent article on grammar schools in the UK.

My understanding is that grammar schools are a subset of schools in the UK that supposedly offer entry on a meritocratic basis and deliver higher quality education. Depending on your political leanings, you either believe that grammar schools reinforce the class division in the UK by giving entry disproportionately to the already higher class and then giving them a better education, or you believe that grammar schools enable class mobility by delivering a better education to bright lower class students who could not otherwise afford such a thing. As an outsider in the UK I’m not qualified to hold an opinion here, but I suspect that each extreme fails to appreciate some nuanced details.

The article appears to have pulled off a classic journalist's ploy:
  1. Present a statistical analysis of the data in a leading way without drawing conclusions
  2. Quote somebody else's opinion on the topic

Essentially you can deliver opinion supported by the apparent full weight of objective statistical analysis without having to put your name to the conclusions which might not hold up to rigorous challenge.

Notice that one of the opinions is much stronger than the other. Notice also that Rosemary Joyce's note has very complex implications which are not at all explored for the reader. Even I’m not sure if she has a point.

I could offer a very different view of the same data:
  • 14 of 32 schools favoured those not privately educated, giving fewer than 6% of offers to the privately educated
  • Taking the 24 of the 32 schools (3/4) with the lowest privately educated proportions, the average was 6%, the same as the overall population
  • Removing the two clear outliers in the data, "Tonbridge Grammar School" and "The Judd School", the privately educated averaged 8.9% overall


I feel the key fact I’m missing is: What % of students in Kent who scored well on the 11-plus exam were privately educated? How does this compare to the 10.89%? How does this compare to the 8.9% removing outliers? Is there a social bias in the offers?

I’m also missing any information about how these numbers have been changing with time. Simon Murphy complains that the government is not taking steps to improve the chances of poor children, and yet for all I know that 10.89% was maybe 12% last year.

What about this “local context” anyhow? How do these percentages compare at a lower level of granularity than county-wide? How do these percentages compare to applications?

Is this a story of a county-wide bias, or just the story of two bad apples and a handful of not-so-good ones? I think I know what The Guardian wants me to think. Data Journalism is still Journalism I suppose.

For my readers, I ask, why do you suppose the 10.89% number is the only one in the text of the article to two decimal places?

Thursday, March 29, 2012

My Jealous Supermarket II

Last week I wrote the Figure It Out article that was published on the Capgemini Consulting UK Operational Research team blog, My Jealous Supermarket.

I encourage you to click through and read the article. To summarise, my supermarket is targeting me with discount coupons in order to maintain my loyalty, which it mistakenly thinks it is losing because I am a travelling consultant and have shopped very little lately.

Anyhow, after returning from a week in the north of England followed by a weekend in Florence followed by another week in the north, it was immensely satisfying to see the "Spend £30, get £3 off" coupon roll off the receipt printer when I purchased ingredients for my meal this evening after making no purchases for two weeks. They DO care!

I wonder how this initiative is going for them? Are they successfully winning people back? Any proper initiative would have a benefits tracking element following implementation, but comparing before and after and asserting causality is always difficult. Consider myself. One day I will wrap up my project in the north and spend some time in London again. I will return to my supermarket and purchase lots of food. Success! After months of giving me coupons, they will have finally won back my favour and loyalty. Or not...

Monday, February 13, 2012

Numbers in 2011 - from More or Less podcast

One of my favourite podcasts is BBC's More or Less. At the start of 2012, they did a series on Numbers in 2011. I know it's a little late in sharing this, but here we go - enjoy.

I'm sharing with you a selection of the numbers from the 30-minute podcast. They are somewhat UK-centric, but still worth sharing.

Listen to the whole podcast here.

  1. 80%: developed world's debt to GDP ratio
  2. 1.37: cost of petrol in GBP on 9 May 2011 (highest in 2011), due to duty, value added tax (20%) & exchange rate (weaker GBP against USD)
  3. 1%: BBALIBOR (interest to be paid in 3 months' time) crossed 1% on 10 Nov 2011, double the bank interest rate. BBALIBOR indicates the risk of money not being paid back in 3 months - a show of lower confidence/trust between banks.
  4. 2.64m: unemployment in UK by December 2011 (highest in 17 years). Note UK population is just over 62m.
  5. 900k: people today working beyond 65 years old in the UK
  6. 12,500: people celebrated their 100th birthday in 2011 in the UK; this will rise to 100,000 over the next 25 years
  7. 7bn: world population
  8. 2.5: average fertility of women on earth (babies per woman per lifetime, down from 6 sixty years ago), easing the pressure on the environment I suppose
  9. 3,000 GBP: cost of sequencing a human genome; in 2003, the first sequencing of the human genome cost 600m GBP - that's a 200,000-fold reduction in cost in 8 years
  10. 2 weeks: to sequence 5 human genomes in 2010; in 2003, it took 10 years for one

Tuesday, January 3, 2012

School uniforms in developing countries: An unnecessary evil? - High-level test

Earlier I wrote a post about the requirement for school uniforms in developing countries and how I saw this as a potentially offensive injustice. I completed the first step by forming my hypothesis, "The unnecessary requirement for school uniforms in developing countries puts undue financial stress on families already struggling to afford basic necessities and/or tuition, and potentially even excludes some children from attendance." Now I am looking to test that hypothesis quickly at a high level. I want to do some research to gain reasonable assurance that the hypothesis is correct before I might move on to establish the magnitude of the problem.

Schools for Africa is a UK Registered Charity mainly focused on building schools, but it also says: "£40 will buy 10 sets of primary school uniforms". To put this into perspective:
  • They also say: "£235 will buy 50 text books for the children to share". That's £4.70 per textbook vs. £4.00 per uniform.
  • £4 is about the same as an average day's wages in Ghana
  • £4 is about the same as an average week's wages in Ethiopia
  • I choose these countries as I visited them in 2011, but it is worth noting that Wikipedia reports school uniforms as required in Ghana
The folks at Project Ethiopia, an American 501(c)(3), have reportedly bought 1,695 school uniforms at $8 apiece. These uniforms are also said to last two years, so that's an annual cost of only $4. They make the relevant point that these uniforms are the only set of clothing for many, which would lower the additional burden of the uniform requirement on top of that for clothes. Note, however, that $8 is more than a week's wages as calculated above. Again for perspective:
  • They also claim to buy over library books for $3 a piece
  • They also claim to buy a year's school supplies (5 exercise books, 1 pen, bar of soap) for $3
Gift Ethiopia, a UK Charity will provide an Ethiopian school uniform for £8, describing it as such:
Without a uniform, many children in Ethiopia are unable to attend school. Many families, especially larger ones, struggle to provide a uniform for all their children. These children are denied an education and the chance to socialise with children their own age. Your gift will provide a student with a brand new, full school uniform, ensuring they can take their place in the classroom with pride.
  • £8 for a school uniform is about the same as they say it will cost to provide a school dinner for over 10 weeks
The first program listed on the website for Common Threadz, a 501(c)(3) American non-profit, is "School Uniforms for Orphans & Vulnerable Children". They describe the problem:
For families facing the challenges of poverty in Africa, school clothes are not as crucial as the next meal. The direct costs of education, from a uniform and shoes to books and stationery, force millions of orphans and vulnerable children to miss out on school each year. For a child in need from a poor rural family who may only own one pair of old pants or a tattered dress, a school uniform is not just a requirement, but essential to build confidence and academic success.
World Vision UK runs MustHaveGifts, and sells a pretty smart looking Pakistani school uniform for £12.
  • The uniform is described thusly:
    • Pakistan: Children who can't afford a compulsory school uniform can be denied the right to an education, leaving them vulnerable to exploitation. With a school uniform, children can attend school for the very first time and get on the path to a brighter future.
  • Per capita GDP of $2,500 USD at PPP, de-adjusted to remove PPP, is $941 per year, which is about £1.65 per day or almost £12 per week, roughly the price of the uniform
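
The per-day and per-week figures in the last bullet can be reproduced with a quick back-of-envelope calculation. This is only a sketch: the exchange rate of 1.56 USD per GBP is my assumption of a plausible rate at the time of writing and is not quoted anywhere above, and the variable names are purely illustrative.

```python
# Assumed exchange rate (not from the sources quoted above).
USD_PER_GBP = 1.56

annual_income_usd = 941   # per capita GDP with the PPP adjustment removed
uniform_price_gbp = 12    # price of the uniform quoted above

annual_income_gbp = annual_income_usd / USD_PER_GBP
income_per_day_gbp = annual_income_gbp / 365
income_per_week_gbp = income_per_day_gbp * 7

# How many weeks of average income one uniform represents.
weeks_per_uniform = uniform_price_gbp / income_per_week_gbp

print(f"£{income_per_day_gbp:.2f} per day, £{income_per_week_gbp:.2f} per week")
print(f"one uniform is about {weeks_per_uniform:.1f} weeks of average income")
```

A different exchange rate would move the numbers slightly, but not the conclusion that the uniform costs about a week of average income.
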
Based on the above I think that we can conclude that there is reasonable evidence to suggest that in parts of the developing world school uniforms are comparatively expensive and a prerequisite to education.

The next step, though I may not endeavour to take it due to the scale of effort required, is to gather all of the available evidence together to establish a high-level estimate of the scale of the problem. What is the aggregate cost of school uniforms across the developing world? How many children are denied an education as a consequence of their family not being able to afford school uniforms? Ultimately building to the question, What if the requirement were abolished? Once we know the "size of the prize", and please do forgive me for that blatant consultant-ism, we can begin sizing up what can be done about it.

School uniforms in developing countries: An unnecessary evil? - Hypothesis

There are charities helping families in developing countries to buy school uniforms for their children so that they can attend school. This is a good thing, right? Which part? The part about charities helping families in developing countries or the part where this is even a problem? If what I consider to be an arbitrary policy is preventing impoverished children from getting a primary education, this is a great injustice.

Testing this with a few friends, I have concluded that this quite possibly is the case, and I also received some stark warnings about the social, cultural, and psychological dimensions to school uniforms. These warnings are certainly valid, but many great injustices in this world have been toppled that were held up by social, cultural, and psychological factors. The question is, how big is the problem, how big are the barriers, and are our efforts best placed elsewhere?

It occurred to me that this is an opportunity to try out some strategic modelling and analysis, something that I do often in my current work. I have already completed the first step of forming a hypothesis and testing with a few peers. To pursue the problem further I would take the following steps:
  1. Form a hypothesis:
      The unnecessary requirement for school uniforms in developing countries puts undue financial stress on families already struggling to afford basic necessities and/or tuition, and potentially even excludes some children from attendance.
  2. Test hypothesis at a high level
      Gather whatever evidence is at hand or easily available to sense-check and/or refine the hypothesis. Might the hypothesis be true? Is it likely enough to be true to warrant further investigation?
  3. Estimate the magnitude of the problem/scale of the potential benefits from taking action
      This will be much like a top-down strategic business case. The key focus will be "What if we could achieve a change?" without yet talking specifically about what actions would be required. Like the previous step, this is another gate we have to pass where we must be certain it is worthwhile proceeding. The output can also be an important number socially, as $x million lost per year or y thousand children excluded from primary education worldwide can be a useful catalyst for change as it is shared and repeated.
  4. Develop a portfolio of initiatives
      Preferably in a brainstorming/facilitated workshop environment, work with stakeholders and subject matter experts to generate potential initiatives or interventions to address the problem.
  5. Prioritize initiatives
      Estimate costs, benefits, and risks of each initiative and then build an action plan, selecting the highest benefit set of activities that fit within your budget or capacity while managing/minimizing risk. This is a classic Operations Research portfolio optimization knapsack problem, though in practice, problem sizes are small and mathematics is rarely used.
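
For the curious, the prioritisation step can be sketched as a tiny 0/1 knapsack. Because real portfolios are small, the sketch below simply tries every combination of initiatives instead of using a proper optimisation solver. The initiative names, costs, benefits, and budget are all hypothetical, and risk is ignored entirely to keep the example short.

```python
from itertools import combinations


def best_portfolio(initiatives, budget):
    """Return the affordable set of initiatives with the highest total benefit.

    initiatives: list of (name, cost, benefit) tuples.
    Brute force over all subsets, which is fine for a handful of initiatives.
    """
    best_names, best_benefit = (), 0
    for size in range(1, len(initiatives) + 1):
        for combo in combinations(initiatives, size):
            cost = sum(item[1] for item in combo)
            benefit = sum(item[2] for item in combo)
            if cost <= budget and benefit > best_benefit:
                best_names = tuple(item[0] for item in combo)
                best_benefit = benefit
    return best_names, best_benefit


# Hypothetical initiatives: (name, cost, benefit) in arbitrary units.
initiatives = [
    ("A", 40, 70),
    ("B", 30, 50),
    ("C", 50, 80),
    ("D", 20, 30),
]

names, benefit = best_portfolio(initiatives, budget=100)
print(names, benefit)
```

Note that the single best-value initiative (A) does not make the final cut here, which is the kind of result that is easy to miss when prioritising by gut feel alone.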