Saturday, April 28, 2012

Data Journalism

I've recently started following the Guardian's Data Blog, but I was a little disappointed with their recent article on grammar schools in the UK.

My understanding is that grammar schools are a subset of schools in the UK that supposedly offer entry on a meritocratic basis and deliver a higher-quality education. Depending on your political leanings, you either believe that grammar schools reinforce class division in the UK by giving entry disproportionately to the already higher classes and then giving them a better education, or you believe that they enable class mobility by delivering a better education to bright lower-class students who could not otherwise afford it. As an outsider in the UK I’m not qualified to hold an opinion here, but I suspect that each extreme fails to appreciate some nuanced details.

The article appears to have pulled off a classic journalist's ploy:
  1. Present a statistical analysis of the data in a leading way without drawing conclusions
  2. Quote somebody else's opinion on the topic

Essentially, you can deliver opinion supported by the apparent full weight of objective statistical analysis without having to put your name to conclusions that might not hold up to rigorous challenge.

Notice also that one of the opinions is much stronger than the other, and that Rosemary Joyce's note has very complex implications which are not explored at all for the reader. Even I’m not sure if she has a point.

I could offer a very different view of the same data:
  • 14 of 32 schools favoured those not privately educated, giving fewer than 6% of offers to the privately educated
  • Taking the 24 of the 32 schools (3/4) with the lowest privately educated proportions, the average was 6%, the same as in the overall population
  • Removing the two clear outliers in the data, "Tonbridge Grammar School" and "The Judd School", the privately educated averaged 8.9% overall
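
For anyone who wants to try these alternative cuts themselves, here is a minimal sketch of the three calculations above. The school names and numbers are made up for illustration and are not the Kent figures, and the helper names are my own. I'm also assuming a pooled percentage (total private offers divided by total offers); a simple mean of per-school percentages would give different numbers.

```python
# Hypothetical data: (school name, total offers, offers to privately educated pupils).
# These numbers are invented for illustration; they are NOT the Kent figures.
schools = [
    ("School A", 120, 3),
    ("School B", 150, 6),
    ("School C", 100, 5),
    ("School D", 180, 54),  # a deliberately extreme outlier
    ("School E", 130, 9),
]


def private_share(rows):
    """Pooled percentage of offers that went to privately educated pupils."""
    total = sum(offers for _, offers, _ in rows)
    private = sum(priv for _, _, priv in rows)
    return 100.0 * private / total


def schools_below(rows, threshold_pct):
    """Schools whose own private share is below the given percentage."""
    return [r for r in rows if 100.0 * r[2] / r[1] < threshold_pct]


def lowest_fraction(rows, fraction):
    """The given fraction of schools, taken from the lowest private share up."""
    ranked = sorted(rows, key=lambda r: r[2] / r[1])
    return ranked[: round(len(ranked) * fraction)]


def without(rows, names):
    """All schools except those named as outliers."""
    return [r for r in rows if r[0] not in names]


print(f"overall: {private_share(schools):.2f}%")
print(f"schools under 6%: {len(schools_below(schools, 6))} of {len(schools)}")
print(f"lowest three quarters: {private_share(lowest_fraction(schools, 0.75)):.2f}%")
print(f"without outliers: {private_share(without(schools, {'School D'})):.2f}%")
```

The point of the sketch is only that the same table supports several headline numbers, depending on which rows you keep.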

I feel the key fact I’m missing is: What % of students in Kent who scored well on the 11-plus exam were privately educated? How does this compare to the 10.89%? How does this compare to the 8.9% removing outliers? Is there a social bias in the offers?

I’m also missing any information about how these numbers have been changing over time. Simon Murphy complains that the government is not taking steps to improve the chances of poor children, and yet, for all I know, that 10.89% might have been 12% last year.

What about this “local context” anyhow? How do these percentages compare at a finer level of granularity than county-wide? How do these percentages compare to applications?

Is this a story of a county-wide bias, or just the story of two bad apples and a handful of not-so-good ones? I think I know what The Guardian wants me to think. Data Journalism is still Journalism, I suppose.

So I ask my readers: why do you suppose the 10.89% number is the only one in the text of the article given to two decimal places?