Thursday, May 31, 2012

Consistent Education Divide in Cities

The Daily Viz brought this to my attention. It's a visual by the New York Times showing how the distribution of cities by proportion of adults with college degrees has changed over the last 40 years.

Nicely formatted and presented, though my ability to compare the distributions side-by-side is a little bit limited.

The key story that this visual is telling is that the average has moved from 12% to 32%, but that the number of cities more than 5% above or below the average has increased substantially. "College graduates are more unevenly distributed in the top 100 metropolitan areas now than they were four decades ago." But I'm not sure if it's as simple as that.

Suppose I was measuring trees. One species was 10 feet tall on average and species two was 100 feet tall. If the first tended to vary between 7 feet and 13 feet, but the latter tended to vary from 85 feet to 115 feet, I wouldn't remark at how much more variable these trees were. For species one, no tree was more than 3 feet from the average, but in species two, presumably many are. Is this a sign that species two is more unevenly distributed? Not really. Species one varies up and down by 30% where two does so by 15%.

So I asked myself, given that the average proportion of adults with college degrees has nearly tripled to 32%, has their variability increased proportionally? Now that these trees are 32 feet tall, it seems strange to still measure their "unevenness" by how many of them are between 27 and 37.

So I reached out to a statistic, the Coefficient of Variation. Using my eyes to collect the data from the charts (so not precisely the correct data), I calculate a coefficient of 0.25 in 1970 and 0.22 in 2010. The variation in the data as a proportion of the average has gone down in the last four decades.
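For anyone who wants to repeat this kind of check, here is a minimal Python sketch of the coefficient of variation (standard deviation divided by the mean). The tree heights are made-up values chosen to echo the two species described earlier, not measured data, and the function name is my own.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation expressed as a fraction of the mean."""
    return pstdev(values) / mean(values)


# Hypothetical heights (feet) echoing the two tree species in the example.
species_one = [7, 8.5, 10, 11.5, 13]
species_two = [85, 92.5, 100, 107.5, 115]

cv_one = coefficient_of_variation(species_one)
cv_two = coefficient_of_variation(species_two)

# Species two has the wider absolute spread but the smaller relative spread.
print(f"species one: CV = {cv_one:.2f}")
print(f"species two: CV = {cv_two:.2f}")
```

Because the statistic is a ratio, multiplying every value by the same factor leaves it unchanged, which is why it suits comparing distributions whose averages are far apart.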

Again, the NYT concludes that "College graduates are more unevenly distributed in the top 100 metropolitan areas now than they were four decades ago", but I would argue that, if anything, they are slightly more evenly spread than before, and not remarkably so.

Saturday, April 28, 2012

Data Journalism



I've recently started following the Guardian's Data Blog, but I was a little disappointed with their recent article on grammar schools in the UK.

My understanding is that grammar schools are a subset of schools in the UK that supposedly offer entry on a meritocratic basis and deliver higher quality education. Depending on your political leanings, you either believe that grammar schools reinforce the class division in the UK by giving entry disproportionately to the already higher class and then giving them a better education, or you believe that grammar schools enable class mobility by delivering a better education to bright lower class students who could not otherwise afford such a thing. As an outsider in the UK I’m not qualified to hold an opinion here, but I suspect that naturally each extreme fails to appreciate some nuanced details.

The article appears to have pulled off a classic journalist's ploy:
  1. Present a statistical analysis of the data in a leading way without drawing conclusions
  2. Quote somebody else's opinion on the topic

Essentially you can deliver opinion supported by the apparent full weight of objective statistical analysis without having to put your name to the conclusions which might not hold up to rigorous challenge.

Notice also that one of the opinions is much stronger than the other. Notice also that Rosemary Joyce's note has very complex implications which are not at all explored for the reader. Even I’m not sure if she has a point.

I could offer a very different view of the same data:
  • 14 of 32 schools favoured those not privately educated, giving fewer than 6% of offers to the privately educated
  • Taking 24 of the 32 schools (3/4) with the lowest privately educated proportions, the average was 6%, the same as the overall population
  • Removing the two clear outliers in the data, "Tonbridge Grammar School" and "The Judd School", the privately educated averaged 8.9% overall


I feel the key fact I’m missing is: What % of students in Kent who scored well on the 11-plus exam were privately educated? How does this compare to the 10.89%? How does this compare to the 8.9% removing outliers? Is there a social bias in the offers?

I’m also missing any information about how these numbers have been changing with time. Simon Murphy complains that the government is not taking steps to improve the chances of poor children, and yet for all I know that 10.89% was maybe 12% last year.

What about this “local context” anyhow? How do these percentages compare at a lower level of granularity than county-wide? How do these percentages compare to applications?

Is this a story of a county-wide bias, or just the story of two bad apples and a handful of not-so-good ones? I think I know what The Guardian wants me to think. Data Journalism is still Journalism I suppose.

For my readers, I ask, why do you suppose the 10.89% number is the only one in the text of the article to two decimal places?

Saturday, April 14, 2012

Thursday, March 29, 2012

My Jealous Supermarket II

Last week I wrote the Figure It Out article that was published to the Capgemini Consulting UK Operational Research team blog, My Jealous Supermarket.

I encourage you to click through and read the article. To summarise, my supermarket is targeting me with discount coupons in order to maintain my loyalty, which it mistakenly thinks it is losing because I am a travelling consultant and have shopped very little lately.

Anyhow, after returning from a week in the north of England followed by a weekend in Florence followed by another week in the north, it was immensely satisfying to see the "Spend £30, get £3 off" coupon roll off the receipt printer when I purchased ingredients for my meal this evening after making no purchases for two weeks. They DO care!

I wonder how this initiative is going for them? Are they successfully winning people back? Any proper initiative would have a benefits tracking element following implementation, but comparing before and after and asserting causality is always difficult. Consider myself. One day I will wrap up my project in the north and spend some time in London again. I will return to my supermarket and purchase lots of food. Success! After months of giving me coupons, they will have finally won back my favour and loyalty. Or not...

Monday, February 13, 2012

Numbers in 2011 - from More or Less podcast

One of my favourite podcasts is BBC's More or Less. At the start of 2012, they did a series on Numbers in 2011. I know it's a little late in sharing this, but here we go - enjoy.

I'm sharing with you a selection of the numbers from the 30min podcast. They are somewhat UK centric, but still worthwhile sharing.

Listen to the whole podcast here.

  1. 80%: developed world's debt to GDP ratio
  2. 1.37: cost of petrol in GBP on 9 May 2011 (highest in 2011), due to duty, value added tax (20%) & exchange rate (weaker GBP against USD)
  3. 1%: BBA LIBOR (the interest rate on a 3-month loan between banks) crossed 1% on 10 Nov 2011, double the bank interest rate. BBA LIBOR indicates the risk of money not being paid back in 3 months, so the rise is a show of lower confidence/trust between banks.
  4. 2.64m: unemployment in UK by December 2011 (highest in 17 years). Note UK population is just over 62m.
  5. 900k: people today working beyond 65 years old in the UK
  6. 12,500: people celebrated their 100th birthday in 2011 in the UK; this will rise to 100,000 over the next 25 years
  7. 7bn: world population
  8. 2.5: average fertility of women on earth (babies per woman per lifetime, down from 6 sixty years ago), easing the pressure on the environment I suppose
  9. 3,000 GBP: cost of sequencing a human genome; in 2003, the first sequencing of the human genome cost 600m GBP - that's a 200,000-fold reduction in cost in 8 years
  10. 2 weeks: to sequence 5 human genomes in 2010; in 2003, it took 10 years for one
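As a quick sanity check on the last two items, here is a small Python sketch of the arithmetic. The cost and time figures are the ones quoted in the list; the 52-weeks-per-year conversion is my own simplifying assumption.

```python
# Figures as quoted in items 9 and 10 of the list.
first_genome_cost_gbp = 600_000_000   # first human genome, 2003
recent_genome_cost_gbp = 3_000        # one genome, 8 years later
cost_fold = first_genome_cost_gbp / recent_genome_cost_gbp

# Assumption: 52 weeks per year, so both durations are in weeks per genome.
weeks_per_genome_2003 = 10 * 52 / 1   # 10 years for one genome
weeks_per_genome_2010 = 2 / 5         # 2 weeks for five genomes
speed_fold = weeks_per_genome_2003 / weeks_per_genome_2010

print(f"cost reduction: {cost_fold:,.0f}-fold")
print(f"time reduction per genome: {speed_fold:,.0f}-fold")
```

The cost ratio reproduces the 200,000-fold figure; the time ratio is derived here from the quoted durations rather than taken from the podcast.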

Tuesday, January 3, 2012

School uniforms in developing countries: An unnecessary evil? - High-level test

Earlier I wrote a post about the requirement for school uniforms in developing countries and how I saw this as a potentially offensive injustice. I completed the first step by forming my hypothesis, "The unnecessary requirement for school uniforms in developing countries puts undue financial stress on families already struggling to afford basic necessities and/or tuition, and potentially even excludes some children from attendance." Now I am looking to test that hypothesis quickly at a high level. I want to do some research to gain reasonable assurance that the hypothesis is correct before I might move on to establish the magnitude of the problem.

Schools for Africa is a UK Registered Charity mainly focused on building schools, but who also say: "£40 will buy 10 sets of primary school uniforms". To put this into perspective:
  • They also say: "£235 will buy 50 text books for the children to share". That's £4.70 per textbook vs. £4.00 per uniform.
  • £4 is about the same as an average day's wages in Ghana
  • £4 is about the same as an average week's wages in Ethiopia
  • I choose these countries as I visited them in 2011, but it is worth noting that Wikipedia reports school uniforms as required in Ghana
The folks at Project Ethiopia, an American 501(c)(3), have reportedly bought 1,695 school uniforms at $8 apiece. These uniforms are also said to last two years, so that's an annual cost of only $4. They make the relevant point that these uniforms are the only set of clothing for many, which would lower the additional burden of the uniform requirement on top of that for clothes. Note, however, that $8 is more than a week's wages as calculated above. Again for perspective:
  • They also claim to buy library books for $3 apiece
  • They also claim to buy a year's school supplies (5 exercise books, 1 pen, bar of soap) for $3
Gift Ethiopia, a UK Charity will provide an Ethiopian school uniform for £8, describing it as such:
Without a uniform, many children in Ethiopia are unable to attend school. Many families, especially larger ones, struggle to provide a uniform for all their children. These children are denied an education and the chance to socialise with children their own age. Your gift will provide a student with a brand new, full school uniform, ensuring they can take their place in the classroom with pride.
  • £8 for a school uniform is about the same as they say it will cost to provide a school dinner for over 10 weeks
The first program listed on the website for Common Threadz, a 501(c)(3) American non-profit, is "School Uniforms for Orphans & Vulnerable Children". They describe the problem:
For families facing the challenges of poverty in Africa, school clothes are not as crucial as the next meal. The direct costs of education, from a uniform and shoes to books and stationery, force millions of orphans and vulnerable children to miss out on school each year. For a child in need from a poor rural family who may only own one pair of old pants or a tattered dress, a school uniform is not just a requirement, but essential to build confidence and academic success.
World Vision UK runs MustHaveGifts, and sells a pretty smart looking Pakistani school uniform for £12.
  • The uniform is described thusly:
    • Pakistan: Children who can't afford a compulsory school uniform can be denied the right to an education, leaving them vulnerable to exploitation. With a school uniform, children can attend school for the very first time and get on the path to a brighter future.
  • At $2,500 per capita (PPP), Pakistan's GDP de-adjusted to remove PPP is $941 per year, which is £1.65 per day or almost £12 per week, so the £12 uniform costs about a week's average income
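The Pakistan conversion can be reproduced with a few lines of Python. This is only a sketch: the $941 and £12 figures come from the text above, while the exchange rate is my assumption (roughly what the £1.65 per day figure implies), not a sourced value.

```python
# Figures quoted in the post; the exchange rate is an assumption.
annual_income_usd = 941      # per capita GDP with the PPP adjustment removed
usd_per_gbp = 1.56           # assumed exchange rate, not from a cited source
uniform_cost_gbp = 12        # World Vision UK uniform price

daily_income_gbp = annual_income_usd / 365 / usd_per_gbp
weekly_income_gbp = daily_income_gbp * 7
weeks_of_income = uniform_cost_gbp / weekly_income_gbp

print(f"daily income:  £{daily_income_gbp:.2f}")
print(f"weekly income: £{weekly_income_gbp:.2f}")
print(f"uniform cost:  {weeks_of_income:.2f} weeks of average income")
```

Changing `usd_per_gbp` shows how sensitive the "about a week's income" comparison is to the exchange rate used.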
Based on the above I think that we can conclude that there is reasonable evidence to suggest that in parts of the developing world school uniforms are comparatively expensive and a prerequisite to education.

The next step, though I may not endeavour to take it due to the scale of effort required, is to gather all of the available evidence together to establish a high-level estimate of the scale of the problem. What is the aggregate cost of school uniforms across the developing world? How many children are denied an education as a consequence of their family not being able to afford school uniforms? Ultimately building to the question, What if the requirement were abolished? Once we know the "size of the prize", and please do forgive me for that blatant consultant-ism, we can begin sizing up what can be done about it.

School uniforms in developing countries: An unnecessary evil? - Hypothesis

There are charities helping families in developing countries to buy school uniforms for their children so that they can attend school. This is a good thing, right? Which part? The part about charities helping families in developing countries or the part where this is even a problem? If what I consider to be an arbitrary policy is preventing impoverished children from getting a primary education, this is a great injustice.

Testing this with a few friends, I have concluded that this quite possibly is the case, and I also received some stark warnings about the social, cultural, and psychological dimensions to school uniforms. These warnings are certainly valid, but many great injustices in this world have been toppled that were held up by social, cultural, and psychological factors. The question is, how big is the problem, how big are the barriers, and are our efforts best placed elsewhere?

It occurred to me that this is an opportunity to try out some strategic modelling and analysis, something that I do often in my current work. I have already completed the first step of forming a hypothesis and testing with a few peers. To pursue the problem further I would take the following steps:
  1. Form a hypothesis:
      The unnecessary requirement for school uniforms in developing countries puts undue financial stress on families already struggling to afford basic necessities and/or tuition, and potentially even excludes some children from attendance.
  2. Test hypothesis at a high level
      Gather whatever evidence is at hand or easily available to sense-check and/or refine the hypothesis. Might the hypothesis be true? Is it likely enough to be true to warrant further investigation?
  3. Estimate the magnitude of the problem/scale of the potential benefits from taking action
      This will be much like a top-down strategic business case. The key focus will be "What if we could achieve a change?" without yet talking specifically about what actions would be required. Like the previous step, this is another gate we have to pass where we must be certain it is worthwhile proceeding. The output can also be an important number socially, as $x million lost per year or y thousand children excluded from primary education worldwide can be a useful catalyst for change as it is shared and repeated.
  4. Develop a portfolio of initiatives
      Preferably in a brainstorming/facilitated workshop environment, work with stakeholders and subject matter experts to generate potential initiatives or interventions to address the problem.
  5. Prioritize initiatives
      Estimate costs, benefits, and risks of each initiative and then build an action plan, selecting the highest benefit set of activities that fit within your budget or capacity while managing/minimizing risk. This is a classic Operations Research portfolio optimization knapsack problem, though in practice problem sizes are small and the mathematics is rarely used.
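
To make the knapsack idea in step 5 concrete, here is a minimal Python sketch using textbook 0/1 knapsack dynamic programming. The initiatives, costs, and benefit scores are entirely hypothetical placeholders of mine, and risk is ignored for simplicity.

```python
def select_initiatives(initiatives, budget):
    """0/1 knapsack by dynamic programming: maximise benefit within budget.

    initiatives: list of (name, cost, benefit) tuples with integer costs.
    Returns (total_benefit, chosen_names).
    """
    # best[b] holds the best (benefit, names) found with total cost <= b.
    best = [(0, [])] * (budget + 1)
    for name, cost, benefit in initiatives:
        # Walk budgets downwards so each initiative is used at most once.
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + benefit
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + [name])
    return best[budget]


# Hypothetical portfolio: (name, cost, benefit score). Not real estimates.
portfolio = [
    ("Subsidise uniforms", 40, 60),
    ("Lobby to abolish the requirement", 30, 70),
    ("Local tailoring co-operatives", 50, 65),
    ("Second-hand uniform exchange", 20, 30),
]

total_benefit, chosen = select_initiatives(portfolio, budget=100)
print(total_benefit, chosen)
```

With only a handful of initiatives, listing the affordable combinations by hand gives the same answer, which is partly why the formal mathematics is rarely reached for in practice.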

Friday, December 30, 2011

Operational Research Consulting & Data Journalism

As data becomes more and more accessible, together with visualisation tools becoming more available and user friendly, Data Journalism is heating up. I've been following the Guardian's Data Blog enthusiastically; it is full of interesting information relevant to current affairs, explained with plenty of facts and data.

This article talks about the 10 point guide to data journalism. I particularly like point 5:
Data journalism is 80% perspiration, 10% great idea, 10% output
The Prezi under point 5 explains the process of how data is used to support news: the angles to consider when mashing datasets together, the technical challenges of working with data, and the iterative calculation and QA process, which finally gets turned into the beautiful output with the various (mostly free) visualisation tools.

This is practically the same process that an Operational Research consulting project takes - or any application of OR or Science in general:
  • Understand what the problem/question is
  • Create a hypothesis to be proven or disproved
  • Define what data is needed for the quest
  • Get the data
  • Clean it, and manipulate/wrangle with it so it's usable for analysis
  • Analyse/calculate to come to some conclusion - hence proving or disproving the hypothesis
  • Compare it to subject matter experts' view on what the likely answer should be (sanity check)
  • Refine the analysis until satisfied
  • Shape the output message so it can be easily understood by the audience
  • Communicate the findings
  • All throughout the process, keep communicating to the audience to make sure they are engaged and understand (principle-wise) what you're trying to do, so that they are not unpleasantly surprised when the final answer is presented
  • Best yet, to ensure smooth change management if your solution is to be implemented, work closely with the end users from the start of designing the solution, and then implement and test, so that they believe in the solution because they were part of the creation process.
As the Flowing Data blog points out, this is what statisticians do. I will add that this is what Science does in general. I will also say that in practice, the first step, "understanding what the problem/question is", often takes 70-80% of the time. The technical 'doing' that follows is, in practice, relatively easy by comparison, and it is the part our academic institutions thoroughly prepare us for (which is needed).

For those interested in the how of data journalism, read this about the work that went into reporting on the 2011 London Riots. Fascinating social media analytics at work. Not easy. Impressive and very interdisciplinary.

P.S. Most of this post has been sitting as draft since the summer, hence referencing 'old' news. It's still relevant, so why not.

Sunday, August 14, 2011

Operational Research considered 1 of 6 disciplines in Social Sciences

Okay, so OR is grouped with Statistics as one of the six disciplines of social sciences, but still, I'm pleasantly surprised that OR is mentioned!

According to QS World University Rankings, the six disciplines considered as part of social sciences are:
  • Finance
  • Economics and Econometrics
  • Law
  • Politics and International Relations
  • Sociology
  • Statistics and Operational Research
Here you can download the full table (yeah, Google Doc!), and see the top 10 universities at a glance for each of the above subjects. For Stats and OR, here are your top 10:

Rank  Institution                                  Country
1     Stanford University                          United States
2     Harvard University                           United States
3     University of California, Berkeley (UCB)     United States
4     University of Cambridge                      United Kingdom
5     Massachusetts Institute of Technology (MIT)  United States
6     University of Oxford                         United Kingdom
7     National University of Singapore (NUS)       Singapore
8     University of Toronto                        Canada
9     Imperial College London                      United Kingdom
10    Princeton University                         United States
P.S. If you haven't discovered it already, the Guardian's Data Blog is great!

Sunday, July 31, 2011

An Alternative Way to Fly (as long as expectations are managed)

The purpose of this post is to share the discovery of an alternative way of operating an airline (flight schedule and route wise).



No matter how airlines degrade their service standards these days in the West, I think it's fair to say that most of us still believe that most airlines *intend* to:
  • Take off on-time
  • Land on-time
  • Fly us from A to B as the ticket says, without surprise stops
  • (Oh, and have toilets, of course)

On a recent trip to Ethiopia, we were shown a rather different way of operating an airline. It contradicts all of the above, but it works. We took 4 internal flights.

Here is how we experienced them first hand:
  • 1 left on time as per the ticket, and even got us there early (bonus!), because...
  • None of the 4 flights flew the route it originally said it would: stopovers were skipped to go direct instead, or direct flights had stopovers added at the last minute
  • None of them arrived late, because...
  • Some of them took off earlier than stated
  • Additionally, the air stewardesses were lovely, and they gave passengers snacks and drinks (*gasp* what novelty!)
  • To their credit, they did try to inform passengers of the changes a couple of days ahead of the flight (in our case by email, which we only read after we got back to London).
  • They also tell passengers to double check the flight times a couple of days before, to be aware of any late changes.
(For your curiosity: the international flights from London to Addis Ababa were quite standard. The only oddity was that they weighed everyone's carry-on luggage at the gate, because it's apparently a popular flight to take lots of stuff with you!)

IMHO, an airline would play this game because (we suspect - unconfirmed):
  • It wants to minimise costs - mainly fuel in this case.
  • It has 1-2 planes that fly in circles to cover off a handful of popular destinations.
  • As the airline gets more and more requests for seats through the form of purchased tickets, it is faced with an optimisation problem to fly all its customers to their expressed destinations with minimum cost. The best way to do this is probably through re-shuffling the schedule. For instance, if a plane is hopping from A to B to C in sequence, where B is closer to A than C is, and if we discover 2 days before the flight that the plane is filled with 2/3 passengers going to C, and 1/3 going to B, then flying A->C->B is cheaper than A->B->C. What if there are customers wishing to go from B to C? We hear that the airline is known for cancelling flights as well. Luckily, we didn't experience this.
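
The A/B/C example can be sketched as a toy model in Python. Everything here is an assumption of mine rather than anything the airline has confirmed: the distances and passenger counts are made up, all passengers board at A, the plane loops back to A either way (so total distance is the same for both orders), and passenger-km carried stands in for the load-dependent part of the fuel bill.

```python
from itertools import permutations

# Hypothetical leg distances (km) and passenger counts; not airline data.
distance = {("A", "B"): 200, ("A", "C"): 250, ("B", "C"): 300}
passengers_to = {"B": 30, "C": 60}   # 1/3 to B, 2/3 to C, all boarding at A


def leg(a, b):
    """Distance between two airports, regardless of direction."""
    return distance.get((a, b)) or distance[(b, a)]


def load_cost(order):
    """Passenger-km carried when visiting the stops in the given order.

    The empty return leg to A adds nothing, so it is left out.
    """
    on_board = sum(passengers_to.values())
    here, cost = "A", 0
    for stop in order:
        cost += on_board * leg(here, stop)
        on_board -= passengers_to[stop]   # passengers disembark at this stop
        here = stop
    return cost


costs = {order: load_cost(order) for order in permutations(passengers_to)}
best_order = min(costs, key=costs.get)
print(costs, best_order)
```

With these particular numbers, dropping the larger group at C first carries fewer passenger-km, but the result depends on the geometry: make C far enough from A and the original order wins again. The sketch also ignores the B to C passengers mentioned above.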
This way of operating an airline is possible, because:
  • It is a monopoly.
  • The number of flights is small, so it's easy to manage change.
  • Customers expect it and adjust flying behaviour accordingly (i.e. always check the flight times before the day of flight, and always leave wiggle room before and after the flight).
  • For foreigners who are used to the typical western airline service (i.e. expect it to take-off and land on-time and fly the route it says it would), the price justifies it and shuts people up from complaining, and instead people will have a laugh (or write a blog post!) about it.
  • It doesn't call itself "Precision Airline" (the Tanzanian airline), and can afford to deviate a little. 8-)
P.S. If you are planning to visit Ethiopia, and intend to fly within the country, you may want to consider buying the tickets within the country rather than online. It is significantly cheaper due to price control. This is true as of spring 2011, so double check this before you travel.