17 March 2013

The Data Delusion: On average, it’s a bit more complicated.

Filed in Leadership, Teaching and Learning


With apologies to Richard Dawkins...


Increasingly I am becoming frustrated by the lack of sophistication applied to the whole process of evaluating educational outcomes.  As a consequence, all kinds of perverse and spurious conclusions are drawn, and schools, teachers and policy makers end up jumping through hoops that have no real basis.  If we’re not careful, we’re going to lose sight of what matters… if we haven’t done so already.

I will try to illustrate the point… always conscious that I will inevitably be over-simplifying, so please bear that in mind.

There are two major issues with the measurement of educational outcomes:

  1. The things we are measuring – knowledge, skills and understanding – are, for the most part, intangible, ephemeral and invisible; brains are very complicated and we don’t really understand them.  As Dylan Wiliam is fond of saying: “Learning isn’t rocket science; it is much more complicated than that”.  We often fall into the trap of assuming that our measurements capture the extent of learning, thus limiting our view of what learning is to that which is measurable.  We are simply not good (on average!) at dealing with learning that is beyond the scope of our measurement tools.
  2. Everything we do in schools, everything that constitutes learning, is subject to our values system.  What we value in terms of learning outcomes is not absolute, and as humans with different world views, living in a democracy, we have to work hard to arrive at a consensus about what matters.  Pythagoras’ theorem may be one of the universal truths of space, but whether it matters is something we decide; knowing it and being able to use it are different things and, again, subject to our values.

So, in this context of complexity, naturally enough, we try to create order.  It is sensible enough to agree on a curriculum defining the things that should be known and understood (whether the Government should decide this or not is another issue).  On the micro-scale of simple questions, it is meaningful to assess learning against the curriculum objectives:

  • An understanding that momentum is conserved in collisions
  • The ability to spell ‘disaggregate’ correctly
  • The ability to write the symbol or word equation for photosynthesis and use it to explain various features of plant growth
  • An awareness of the key events of World War II
  • The ability to understand “Fortiter ex animo” or “Wir könnten bowlen gehen”

More subtly, we can assess complex accretions of learning:

  • The quality of writing in an essay; whether it is coherent and makes a good argument
  • The success of an evaluation of multiple causal factors and their inter-relationship in determining the key reasons for a certain outcome (in any subject)
  • The skill and originality shown in producing a composition

From questions to basic tests and assessment tasks, right up to long exams and extended pieces of work, we do what we can to make sense of what has been learned and to give it value.  This is nuanced; every answer to ‘why does your heart beat faster during exercise?’ is different – even if there is an objective truth.  Try it! My Y10 daughter complained the other day that in English Literature ‘they say there is no right answer… but there always is!’ The truth-values interplay is an everyday experience for learners and teachers.

However – and here is the point at which we start to lose meaning – in seeking to capture the essence of our assessment in order to communicate and record it, we continually attempt to make something complicated very simple, and we turn real meaning into a code: data.  Superficially this is innocuous but ultimately, unless we’re very careful, all manner of distortions arise.  Another Dylan Wiliam quote: “A man with one foot in boiling water and another in freezing water is not, on average, comfortable”.

14/20 or 70% on a maths test:  Already, we’re losing sight of what was learned, settling for an overview.  Two students can get the same score with completely different wrong answers; same score, entirely different learning.  Obviously 80% on a ‘hard test’ is better than 80% on an ‘easy test’… so we compare with others; what starts as a record of success in gaining correct answers becomes a statement of relative performance, leading to the bell-curve approach.  In truth a lot of assessment is relative, not absolute.
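
To make the ‘same score, different learning’ point concrete, here is a minimal sketch with invented marks: two students with identical totals but largely different correct answers, and a raw score that only acquires meaning relative to a cohort. All the data below is made up for illustration.

```python
# Two students, ten questions each (1 = correct, 0 = wrong) - invented data.
student_a = [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
student_b = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]

assert sum(student_a) == sum(student_b) == 7       # identical scores...
shared = sum(a and b for a, b in zip(student_a, student_b))
print(shared)  # ...but only 4 questions answered correctly by both

def percentile_rank(score, cohort):
    """Percentage of the cohort scoring strictly below this score."""
    return 100 * sum(s < score for s in cohort) / len(cohort)

# The same raw mark means different things on a 'hard' and an 'easy' test:
hard_test_cohort = [8, 10, 11, 12, 13, 14, 16]
easy_test_cohort = [14, 15, 16, 17, 18, 18, 19]
print(percentile_rank(16, hard_test_cohort))  # ~85.7: near the top
print(percentile_rank(16, easy_test_cohort))  # ~28.6: near the bottom
```

The same 16/20 puts a student near the top of one cohort and near the bottom of another; the raw number carries almost none of that context on its own.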

4/6 for a question or 23/30 for an essay: Turning a set of ideas into words is a messy process; ascribing a scale to that is messier – so we need criteria.  A 6-mark answer is hard enough to define relative to 5 marks; 23/30 is basically meaningless unless we can separate 23 from 22 or 24 with some consistency.  Moderation meetings for essay-based subjects are interesting!  AQA markers for English A level papers can differ by 30 marks out of 80 in their assessments.  What does it all mean?

Level 6c on a single piece of work or a Y8 report.  National Curriculum levels were designed and defined as a set of attainment statements related to a whole key stage.  In science, the fact that magnesium reacts with oxygen without any mass being lost is a piece of knowledge a student might learn.  There is no sense in which this can be ascribed a level on a par with some other bit of knowledge. None.  The assumption that the depth of learning goes up in linear steps, or that the steps are of equal size within one subject, is a fabrication to create the illusion of progress over time. To assume that the levels have parity across subject disciplines is also pure delusion.  It is literally without meaning.  And yet… Y8s across the country are being told they are at Level 5a and need to progress to 6c. We’ve made it all up.  It just means: learn more; go deeper; express it in a more sophisticated manner.  The level ladders are a super-crude code, divorced from real learning, where the biggest variable by far is the teacher’s interpretation.  At my school we’ve devised our own system that makes sense for us; we didn’t feel we could play the levels game.

70% making ‘three levels of progress’: Despite the house of cards that is Levels, ‘levels of progress’ has become a key OfSTED measure.  It may well be that moving from L3 to L5 is harder than L4 to L7 in some areas of learning; factor in the assessment error and you have a measure running from one massive averaged uncertainty to another.  A statement like ‘70% of students made three levels of progress’ tells you almost nothing about the learning that has taken place or how good the school is.  We are projecting meaning on to something that isn’t there… we really are.
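
The point about a measure running from one uncertainty to another can be made with standard error propagation: a progress figure is the difference between two uncertain measurements, so its uncertainty is larger than either one. A sketch, with invented error values, assuming the two measurement errors are independent:

```python
import math

def progress_with_error(start, end, start_err, end_err):
    """Progress is end - start; for independent measurements the
    uncertainties combine in quadrature (square root of the sum of squares)."""
    gain = end - start
    err = math.sqrt(start_err ** 2 + end_err ** 2)
    return gain, err

# Invented figures: a level measured to +/- 0.7 of a level at each end.
gain, err = progress_with_error(3.0, 5.0, 0.7, 0.7)
print(f"progress = {gain} +/- {err:.1f} levels")  # progress = 2.0 +/- 1.0 levels
```

On these (hypothetical) error figures, ‘two levels of progress’ is really 2 ± 1 levels: anywhere from one level to three.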

Norm-referencing. Like it or not, this is what grades mean.


Grade B on a piece of work: an essay, a painting, a science investigation; in an exam. I had a discussion with a teacher about why he gave B+/A- for essays.  In his head, this was consistent… a B+ is definitely not an A- and he would give the same grades consistently. I have every reason to be dubious about this.  Grading is very clearly not an objective, absolute process; it only has meaning in reference to the cohort – the dreaded norm-referencing.  If you think you can define a grade with criteria that can be tested accurately, you’re doing better than any exam board and most teachers.  I’ve devised plenty of tests. We give scores and percentages and then allocate grades.  How? By seeing how the mean and range of scores compare with other data sets.  Test scores might range from 10/100 to 90/100; here you can see that dividing them up into grade regions might work.  But when the range is 65/100 to 75/100, how meaningful is it to say the students performed at different enough levels to warrant different grades?  Well, this happens all the time. A recent GCSE PE exam at my school gave A-D grades for scores from 62/80 down to 56/80.
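
The norm-referenced allocation described above can be sketched in a few lines. This is not any exam board’s actual procedure, just an illustration of how grading purely by rank order hands out the full spread of grades even when the raw marks are only a few apart; the marks below mirror the GCSE PE anecdote.

```python
def norm_referenced_grades(scores, labels=("D", "C", "B", "A")):
    """Assign grades purely by rank order: the bottom quarter of the cohort
    gets the lowest label and the top quarter the highest, regardless of
    how close together the raw marks actually are."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    grades = [None] * len(scores)
    for rank, i in enumerate(order):
        band = min(rank * len(labels) // len(scores), len(labels) - 1)
        grades[i] = labels[band]
    return grades

narrow_cohort = [56, 58, 59, 60, 61, 62]   # out of 80: six marks end to end
print(dict(zip(narrow_cohort, norm_referenced_grades(narrow_cohort))))
# {56: 'D', 58: 'D', 59: 'C', 60: 'B', 61: 'B', 62: 'A'}
```

Six marks out of 80 separate an A from a D here, exactly because the procedure ranks rather than measures.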

The grade boundary cliff is another issue.  For any exam, in UMS terms, 70 might be an A and 69 a B; here the difference in learning is marginal – zero, in all reality, within the limits of accuracy – but the boundary wall gives massive unfounded significance to a hair’s breadth on an artificial scale.  And look what happened to English GCSEs last summer.  Catastrophic gerrymandering – bursting the bubble of hope people had created that grades were about objective standards instead of rank order. Well, now we all know.

All the stuff about ‘working-at grades’ is also highly dubious.  In GCSE Physics, my subject, there is no sense in which a student starts at grade C, moves up to B and eventually reaches A.  The grades are based on norm-referenced bell-curve analysis of overall performance in the final exams.  At no point is there a C grade until the end… All I can do is evaluate whether they are on the path towards an A; they are never ‘at’ C or B.  Again, this is artificial and needs to be seen as such.

3As, 5Bs and 2Cs for a student:  Next, we aggregate all of this up: we turn scores into UMS, and UMS into grades for various exams, giving a student an overall set of grades.  What does this tell you about what they learned or what they can do? Very little – except in reference to how the testing system, with all its statistical distortion factors and errors, compared them to everyone else.  I think it is ironic that content/knowledge purists often also advocate a testing regime whose output actually only gives you a general sense of what a student’s learning capabilities might be within the specific parameters of a test; i.e. it doesn’t tell you anything about what they know or can do.

60% 5A*-C for a whole school.  Continuing up the chain, we end up defining an entire school – all learners, all learning, everything – in a single piece of data.  60% 5A*-C averages out everything we know about learning to the point of oblivion.  A school where 60% of students got exactly 5 Cs looks similar to one where 60% of students gained a mixture of A*-Cs, including lots of A*s, from a mixed intake.  Most recently, OfSTED squeezed every last drop of meaning out of the whole edifice by putting schools into bands of ‘similar schools’ and ranking them into quintiles on the Data Dashboard.  Here, schools with broadly similar 5A*-C scores (a few percentage points apart) can be ‘top quintile’ and ‘bottom quintile’.  Here, the plot has been utterly lost. There is no reproducible, meaningful sense in which schools’ outcomes can be processed in this way and convey a sense of the quality of learning or the overall educational experience.

A value added score of 986.7 or 1016.3.  In science, we teach students about measurement: accuracy, precision, resolution, reproducibility and so on.  It is standard practice to evaluate errors in measurement and to take care not to over-state the precision of a final result relative to the size of the errors.  For example, if your stop-clock only measures in seconds you can’t say the time for a feather to drop is 8.63 seconds. If you measure 100 drops, the calculator may tell you the average is 8.63 seconds, but your apparatus is not up to the job of giving you that level of precision.  If reaction time suggests an error of 1 second, the best you could hope for is an answer of, say, 9 seconds +/- 1 second.  But do the DFE and OfSTED understand this? No, they do not.  The VA algorithm is deeply flawed.  School A: VA = 995.2 +/- 11.6.  School B: VA = 1002.1 +/- 13.7 (not uncommon).  We are expected to believe School B adds more value than School A… but the errors suggest we cannot make that claim.  Data garbage presented as truth.
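
The VA comparison can be written out explicitly. A rough sketch using the figures above: if the two stated intervals overlap, the data cannot support the claim that one school adds more value than the other. (Checking for overlapping intervals is a conservative shortcut, not a formal significance test.)

```python
def clearly_different(value_a, err_a, value_b, err_b):
    """True only if the intervals [value - err, value + err] do not overlap,
    i.e. the difference is bigger than the stated uncertainties allow."""
    return value_a + err_a < value_b - err_b or value_b + err_b < value_a - err_a

# The figures from the post:
print(clearly_different(995.2, 11.6, 1002.1, 13.7))  # False: no claim possible
print(clearly_different(980.0, 5.0, 1010.0, 5.0))    # True: a genuine gap
```

School B’s central value is higher, but the intervals overlap by a wide margin, so ranking the two schools on these numbers is meaningless.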

Effect sizes of 0.29, 0.65 and 0.84.  My final bit of data delusion is the new and growing search for reproducible and reliable educational outcomes from research.  Hattie and Petty, amongst others, have done work in this area and, for me, the outcomes are interesting.  Over many studies, the rank order of effect sizes (derived from standard deviation calculations) leads to a set of high-impact strategies that ring true for me, with my subjective bias and values.  I should be really pleased.  But, unfortunately, there is already an overwhelming tendency for people to take these figures at face value.  To begin with, the figures are averages. If an effect size is 0.65, no single study may have yielded that outcome; the range may have been 0.1 to 1.2 as the contexts shifted and changed.  Then, the level of precision implied by the second decimal place gives the impression that a 0.65 effect is somehow meaningfully higher than a 0.62 effect – and I have heard people take the surface rank order as gospel truth.  Of course, these are historical, retrospective averages.  They tell you nothing about what might happen in any specific context beyond this: some initiatives, on average, in the past, have been shown statistically, in the particular tests that were done, to exhibit this general pattern.  It might, therefore, be worth looking at these strategies to see if they also work well in your context.
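
For readers unfamiliar with the statistic, an effect size in this literature is typically Cohen’s d: the difference between two group means divided by a pooled standard deviation. The sketch below (with invented study figures) shows the calculation and the averaging problem: a headline mean of 0.6 can come from a set of studies none of which actually found 0.6.

```python
import statistics

def cohens_d(treatment, control):
    """Cohen's d: difference in means over the pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = statistics.stdev(treatment), statistics.stdev(control)
    pooled = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
    return (statistics.mean(treatment) - statistics.mean(control)) / pooled

# Invented per-study effect sizes for one strategy across different contexts:
studies = [0.1, 0.3, 0.5, 0.9, 1.2]
print(round(statistics.mean(studies), 2))  # 0.6 - a figure no single study found
```

The headline figure is an average over contexts that may have varied enormously; quoting it to two decimal places says nothing about the spread behind it.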

As I have shared in this post, what Hattie says about homework is complex. And yet even intelligent people will tell me that 0.29 is a low effect size, therefore homework is a bad strategy.  It makes me weep…

What we need is an intelligent view of assessment that takes account of the distortions inherent in any measurement process; that is capable of embracing the idea of ‘error’; and that does more to link our assessments to the original learning.  Teachers, leaders, inspectors and politicians need to avoid placing a high value on data in ways that cannot be sustained.  Learning is fuzzy; it is complex… let’s embrace that and not reduce it to something where all meaning has been lost.  It is like dropping a bag of marbles. Even when we know, in physics terms, all the laws that determine the motion of a dropped marble, there are so many variables at play that we cannot predict how the marbles will fall and roll. We can see a pattern, we can look at limits… but we can’t describe the detail.  Similarly, if we look at the final resting places of a bag of dropped marbles, we can’t extrapolate backwards to know exactly how they got there.  If this is true for a simple bag of marbles, for learning it is even more complex.  Let’s recognise that.

To finish, an unrepresentative, anecdotal cautionary note.  Another dodgy data dimension is the process of judging lessons and schools through OfSTED inspections and lesson observations.  I know an inspector who is quite happy to tell teachers their lesson is Good (not Outstanding) because of the quality of learning he observes and the levels of progress made.  This man is a creationist; he thinks living things were placed on Earth by a higher being and says ‘I don’t agree with all that evolution stuff’.  Never mind the evidence. If he comes to your school, you’re in serious trouble!

Comments
  • Jim smith
    Posted at 16:34h, 17 March

    Brilliant stuff. Totally agree with the madness of data and its inconsistency. Know your students and teach well – try to keep it simple.

  • Steve Philp (@frogphilp)
    Posted at 18:31h, 17 March

    This is one of those posts where I’m with you… I’m with you… I’m with you… Now you’ve lost me – more down to my brain not functioning than anything to do with the post, but as you say it is complex.

    For me the mess of data in education is made worse by how badly we deal with things that we could actually control. We can’t completely control student performance and yet schools are held accountable for just that. Owen Nelton explains that rather eloquently here: http://matheminutes.blogspot.co.uk/2012/11/the-teachers-dilemma.html

    We could deal with the accountability structures by abolishing league tables which only cause gaming. But we don’t.

    And as for your Ofsted anecdote, I’ll take your Creationist Inspector and raise you a Welsh one: http://frogphilp.com/blog/?p=1224

    • headguruteacher
      Posted at 19:01h, 17 March

      Great story. That’s how these things should be… the inspectors taking an intelligent view of things. I’ve got a blog in the pipeline about the accountability processes I’d like to see… basically we need people who know schools well.

  • Syed Ashrafulla
    Posted at 19:14h, 17 March

    All of these criticisms are not only well-founded; they are broad enough to apply beyond education. Averages do remove a lot of the detail found in sets of data. Standard errors are required whenever differences in performance are discussed. Quantifying qualitative properties perfectly is impossible.

    That being said, these criticisms have no merit in public policy, because they disallow any analysis of teaching. The only way to satisfy these complaints is to not analyze the policy (in this case, education) at all. That alternative is far worse than just admitting the issues but still presenting the analysis.

    Using your value-added score example, if one does find a statistically significant difference (which they will in many cases), then that is actionable evidence. The other criticisms are valid only as caveats, but do not serve to falsify the evidence that one school is performing significantly worse.

    • headguruteacher
      Posted at 19:28h, 17 March

      Thanks Syed. What I am after is a subtle, intelligent and sophisticated use of data. Some schools/teachers/learners are better than others – that is uncontroversial; let’s accept that this is true. The information we are presented with needs to reflect the full range of outputs, with errors and limits to significance fully laid out. There are solutions to grade boundary cliffs, to nonsense standard NC levels and to value added measures… but they require detail. If we’re going to simplify ‘so people can understand’, or for some other reason, we need to be very careful… distortions abound. How do you know one school is worse than another – that the learning is worse? In truth, we don’t really know exactly; we have some clues and those have limits.

  • daja57
    Posted at 19:52h, 17 March

    I never have understood why we don’t just publish UMS scores instead of converting them into grades. I know it won’t solve all the other problems about unreliable data but it is at least one relatively simple quick fix.

    • headguruteacher
      Posted at 20:42h, 17 March

      Yup, agree. The silliest part is when UMS goes to grades and then grades are given a different points score, e.g. A = 52, B = 46… so neighbouring UMS scores at the boundary become separated by 6 points. Nuts!

  • Mike Gunn
    Posted at 19:55h, 17 March

    Agree with everything you’ve said about statistical data here, Tom, but there’s an even higher, more insidious level of rubbish data we have to contend with, which goes: “40% of all schools in the UK fail our children”. This is the crude manipulation of spurious data by the media, who then use it as a political lobbying tool for “more rigour” etc, which politicians respond to even more quickly than the babble of data you’ve outlined above. And does any of it really show you what our students can do in real life, or who they are as people, or what potential they have to change our society?
    Now that’s the sort of data I’d be interested in seeing. If only it were calculable…

  • Matt Bradshaw (@weRhistory)
    Posted at 20:11h, 17 March

    You articulate brilliantly why data is delusional. What is even more dangerous is that the people who ought to listen to this won’t, because it is inconvenient… data gives a veneer of simplicity and convenience to a disordered and complex world. It saves time and (in the current climate) a lot of money.

    • headguruteacher
      Posted at 20:44h, 17 March

      Absolutely… it is a kind of unspeakable truth! We’re all so deeply conditioned to accept it… it will take a massive shift to move to a more organic, nuanced notion of attainment.

  • mrashley37
    Posted at 22:18h, 17 March

    Thoroughly enjoyable read. From the general message to the idea that a figure is “more accurate (or should that be precise?) if it’s got a decimal point”.

    Every member of SLT in every school should understand this.

  • Tim Eaglestone
    Posted at 23:12h, 17 March

    I think you are spot on with this. National Curriculum sub-levels, for me, exemplify the mess we have got into with crude measurement and accountability – and for whom? Politicians who need either some way of justifying policy or a stick with which to bash opponents. And once all this nonsense is aggregated we have a meaningless set of numbers that we pretend tells us something about the difference we are making.

    School leaders feel the pressure most acutely and often pass that pressure on down the line. So we have data collections three times a year (or more frequently) that must show progress towards a ‘target’ based on last year’s national cohort. We confuse the micro with the macro.

    More insidious than that is the culture and climate of measurement and judgement it creates. We constantly feel measured and judged along with the students. This distracts from teaching, feedback and learning, which should be where most of our energy is spent. And as you say, ‘learning is fuzzy’, so it shouldn’t be lost in a box-and-whisker plot.

    Of course we need to measure progress and I believe that schools should be accountable to the local communities which they serve, but the current data does not provide that. Patterns emerging from large sets of data can help us in seeing groups we are serving well and groups we are letting down: if enough marbles repeatedly drop in the same way then that is of interest, that is something we can observe. But we need to be very careful about what we measure to explain that observation.

  • Andrew
    Posted at 23:31h, 17 March

    Brilliant. I shall keep this by my side.

  • behrfacts
    Posted at 09:01h, 18 March

    Another thought-provoking blog Tom. While looking at educational outcomes issues as an independent knowledge broker, I’m also learning from how my daughter is graded by her secondary school across subjects using NC sub-levels. As an informed parent, what I appreciate about this information is that she has been set targets at the start of the year and I get an idea of how she is progressing against them, and how this compares with the rest of the cohort. That is probably enough for triangulation of attainment purposes.

    But what we both look at closely is the effort column. This uses a very simple grade mechanism, but there is no purpose in analysing or collating it – which suits me fine, as it is her teachers’ judgement of her personal qualities in particular subjects. If I get a genuine sense that this is being applied unfairly then I will of course investigate further.

    Finally, the school makes clear that you can’t compare NC levels across subjects, which should probably be written in bold capitals so that all parents take note. I worry that the proposed secondary accountability system of (progression in) grade point averages across 8 capped subjects, which sounds great for removing the C/D focus, may cause confusion amongst most parents, who won’t understand the nuances between attainment in English, Maths, 3 other E-Bac subjects (the sciences could be fun!), and 3 non E-Bac subjects. I look forward to your next post about this.

  • Noel Jenkins (@noeljenkins)
    Posted at 22:25h, 19 March

    Tom – you imply that you have developed a different system of KS3 assessment to the “levels game”. Hope you’ll elaborate on this in a forthcoming post.

  • 2D & 3D SOLO: exploring new dimensions | meridianvale
    Posted at 22:37h, 20 March

    […] Sherrington @headguruteacher in this post ponders whether there are more appropriate measures of student progress than the artificial NC […]

  • Data Delusion Solutions Part 1 | headguruteacher
    Posted at 14:22h, 22 March

    […] my last Data Delusion post, I’ve had an interesting response in three […]

  • D Haigh
    Posted at 13:45h, 26 March

    You’re right about the “value added score of 986.7 or 1016.3.” issue, and people who want to know a bit more should read http://www.ofsted.gov.uk/resources/using-data-improving-schools where David Jesson explains exactly this issue on behalf of Ofsted. However, if you’re below 1000 every year for five years in a row, the maths is a bit different. It’s not dead easy to calculate but the chances that there is a genuine difference rather than just normal variation are much higher when it’s repeated year after year.

    • headguruteacher
      Posted at 14:07h, 26 March

      Thanks for the comment. I agree that all relative measures become more meaningful if sustained over time. However, fundamentally, the VA scale is highly artificial. I’m not even sure that the scale is consistent year on year; it certainly isn’t linear – e.g. 1000, 1010 and 1020 are not equidistant in any meaningful sense. At least the DFE gives the confidence limits; perhaps they should include these for grades too!? That’s where the greatest uncertainties lie.

      • Jon Clarke
        Posted at 09:42h, 08 April

        [2nd attempt at posting – the editor made garbage of some of my punctuation… sorry]

        Hi – coming a bit late to the party…

        Thanks for the blog in general, and this post in particular. You’re highlighting limitations of this field of study that have gradually been dawning on me for a while.

        I disagree with part of your conclusion here:
        > School A: VA = 995.2 +/- 11.6. School B: VA = 1002.1 +/- 13.7 (not uncommon). We are expected to believe School B adds more value than School A… but the errors suggest we cannot make that claim.

        Surely there should be some celebration here that an estimate of error has been published for this? So no, we are not expected to believe that School B adds more value than School A. Arguably, that would have been the conclusion if confidence intervals had not been published – but they were. Of course, the value-added itself still has further problems, but I’m not sure you’ve picked a particularly valuable example to make your point there. What it highlights is that lots of people reading those stats might choose to ignore the confidence intervals – that doesn’t make the stats themselves invalid, just those readers’ statistical awareness.

        • headguruteacher
          Posted at 13:29h, 08 April

          Hi Jon. I agree – it is good that the confidence intervals are shown. However, the degree of accuracy suggested by the decimal place figures is dubious – untenable. The BBC tables allow sorting by VA figures – without the conf intervals.

  • Accountability We Can Trust | headguruteacher
    Posted at 14:49h, 27 March

    […] the validity of the measures as indicators of learning or the quality of education overall.(see The Data Delusion). Despite this, OfSTED judgements are heavily driven by data analysis, reinforced only by snap-shot […]

  • Educational Lab Rats: The Search for Evidence | headguruteacher
    Posted at 07:40h, 06 April

    […] reproducible, how are we going to use research methodologies to the greatest effect? As I argue in The Data Delusion even physical systems that appear simple (like dropping a bag of marbles) are actually too complex […]

  • Progress in my classroom? How it is made and how I know it
    Posted at 17:05h, 13 April

    […] We know giving a number or grade when marking negates any comments given.  We have ridiculous situations where a student is graded a 5a for one piece of work and then told they are a 5c six weeks later after another. You can guarantee that any observer in your class will ask the students for the level they think they are currently working at. And when asked what they need to do to progress, the student had better give an answer based on progression through the sub-levels. @headguruteacher goes into more detail about the mess we are in with data in his post The Data Delusion. […]

  • theback71
    Posted at 12:12h, 22 April

    As someone hoping to step aboard the Headship ship sometime soon, this is an interesting post. It’s interesting anyway, but particularly so for me right now. We’ve just interviewed for a new head at our current school. I sat in on the presentations of all the candidates – given the title ‘the vision for ***’. The one who most impressed our LA advisers and ultimately went on to secure the job? The one who gave the simple answer ‘to get the school to outstanding’ – and how was he to do this? Through use of data and quality of T&L. After this process, I talked to an LA adviser about an upcoming presentation – they said it’s all about the data; everything you mentioned in this post, all its negatives, irrelevancies and confusions, THAT’S what they want to hear. I’m not sure where that leaves those of us who agree with you but need to ‘collude’ in order to advance? Perhaps I won’t get on board after all…

    • headguruteacher
      Posted at 15:25h, 25 April

      Sadly, that is the reality of appointments and I have suffered in a similar way. All parties repeat the data mantra because it is the only tangible thing they have. However, it is possible to use data sensibly, in perspective. So, it seems the thing to do is to take the data talk seriously so that people know you understand it… but, once in post, to make sure it isn’t given more weight than it warrants.

  • Accountability we can trust
    Posted at 21:29h, 24 April

    […] the validity of the measures as indicators of learning or the quality of education overall.(see The Data Delusion). Despite this, OfSTED judgements are heavily driven by data analysis, reinforced only by snap-shot […]

  • Fran
    Posted at 09:48h, 29 May

    Tom, is there any way I could give you a quick ring this week? I have a question regarding your recent blogs – it relates to an interview which is coming up very soon…! I’m at work this week on 01603 610993, if you are around. I would be happy to phone you back, rather than run up your phone bill! If not, no worries. I shall now print off your blog and staple it to my notice board… Fran

    • headguruteacher
      Posted at 14:01h, 30 May

      Hi Fran.. Ring 02082920801. I’ll do my best to answer.


  • Assessment without levels | Teaching: Leading Learning
    Posted at 22:36h, 15 June

    […] Sherrington wrote brilliantly about The Data Delusion back in March, describing how the original conception of National Curriculum levels was corrupted […]

  • My Blog Manifesto | headguruteacher
    Posted at 23:13h, 09 July

    […] Data Delusion, Data Delusion […]

  • My blog manifesto
    Posted at 07:47h, 11 July

    […] See Data Delusion, Data Delusion Solutions […]

  • Exam Reform. Another blog manifesto. | headguruteacher
    Posted at 21:32h, 24 July

    […] The Data Delusion: On average, it’s a bit more complicated […]

  • Exam reform. Another blog manifesto
    Posted at 18:25h, 14 August

    […] The Data Delusion: On average, it’s a bit more complicated […]

  • Assessment in the new National Curriculum – what we’re doing | Teaching: Leading Learning
    Posted at 19:47h, 05 October

    […] I could see the sense in this. However, as Tom Sherrington (@headguruteacher) points out in The Data Delusion it’s a lot more complicated than that. In Languages, for example levels are traditionally […]

  • KS2, KS4, Level 6 and Progress 8 – who do we appreciate? | Teaching: Leading Learning
    Posted at 23:06h, 18 November

    […] Sherrington (@headguruteacher) explains in The Data Delusion how the assessment regimes on which we depend for accountability are a house of cards with very […]

  • “Learning is not rocket Science, it’s a lot more complicated!” | "Knowing how you know"
    Posted at 00:47h, 13 December

    […] more on the headguruteacher […]

  • “Learning is not rocket Science, it’s a lot more complicated!” | ToKnowledge
    Posted at 00:49h, 13 December

    […] See more on the headguruteacher blog […]

  • tonyparkin
    Posted at 11:04h, 13 December

    Excellent stuff. Far too much reliance is placed on dubious data, and far too little attention paid to the inaccuracies and flaws in the assessment processes.

    One of the key pieces of learning I took from my PGCE assessment module was that the year I took A level physics, analysis showed that a candidate awarded a C grade by my exam board had an equal statistical probability of getting a B or a D, given the tightness of the grade boundaries and the marking/moderation error. In the days when a B got you a place in university Physics courses and a D didn’t. The #GCSEfiasco of 2012 helped reinforce this learning.

    Roll on the day when all exam results are required to be published as a numerical mark WITH the +/- standard error. And someone works out a way to accurately assess the reliability of those Ofsted grades 🙂 Then people may become a little more circumspect.

    • headguruteacher
      Posted at 14:57h, 13 December

      Thanks Tony. Roll on the day – I agree with that. 🙂
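Tony’s grade-boundary point lends itself to a quick illustration. The sketch below uses entirely hypothetical numbers — the boundaries, the “true” mark and the assumed marking standard error are made up for illustration, not real exam-board figures — to simulate re-marking a script whose true mark sits comfortably inside the C band:

```python
import random

# Hypothetical grade boundaries and an assumed marking/moderation error.
BOUNDARIES = {"B": 70, "C": 60, "D": 50}  # minimum mark for each grade
TRUE_MARK = 63   # a "secure-looking" C on paper
MARKING_SE = 5   # assumed standard error of marking

def grade(mark):
    """Convert a raw mark into a grade using the boundaries above."""
    if mark >= BOUNDARIES["B"]:
        return "B"
    if mark >= BOUNDARIES["C"]:
        return "C"
    if mark >= BOUNDARIES["D"]:
        return "D"
    return "U"

random.seed(1)
counts = {}
for _ in range(100_000):
    # Each "re-mark" perturbs the true mark by normally distributed error.
    observed = random.gauss(TRUE_MARK, MARKING_SE)
    g = grade(observed)
    counts[g] = counts.get(g, 0) + 1

for g in "BCDU":
    if g in counts:
        print(g, counts[g] / 100_000)
```

With these illustrative numbers, a candidate whose true mark is a solid C still lands in the D band roughly a quarter of the time — exactly the kind of uncertainty that publishing a numerical mark with a ± standard error would make visible.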

  • Taking Stock of the Education Agenda Part 2 | headguruteacher
    Posted at 21:53h, 19 December

    […] measure of all time has also gone.  Schools have been put to the sword on that sandcastle of data delusion for too long.  The new measure uses comparisons explicitly drawn from comparing national profiles […]

  • The Data Delusion: On average, it's a bit more ...
    Posted at 15:04h, 24 May

    […] Increasingly I am becoming frustrated by the lack of sophistication that is applied to the whole process of evaluating educational outcomes. As a consequence, all kinds of perverse and spurious co…  […]

  • The Data Delusion: On average, it’s a bit more complicated. | Leadership of Learning
    Posted at 20:27h, 17 July

    […] The Data Delusion: On average, it’s a bit more complicated.. […]

  • What did you do wrong today? | The Echo Chamber
    Posted at 15:52h, 10 March

    […] during the week and then, this morning, Tom Sherrington (@headguruteacher) RT’d his piece, The Data Delusion, from two years ago. A few minutes later, by a coincidence of Hardyesque proportion, John […]

  • Progress 8: Looks like Data Garbage to me. | headguruteacher
    Posted at 16:36h, 02 May

    […] that is derived from the raw scores on two tests in different subjects.  If you read my posts The Data Delusion or The Assessment Uncertainty Principle, you will see how far we move away from understanding […]

  • Diagnostic Data: the importance of the Little Data | historioblography
    Posted at 12:28h, 14 November

    […] Tom Sherrington has outlined ‘we continually attempt to make something complicated, very simple and we turn […]

  • J
    Posted at 20:42h, 05 December

    Jobs are created and large salaries paid for the management of this alchemy. It’s madness but defines our jobs.

  • The Data Conclusion Confusion | @LeadingLearner
    Posted at 07:01h, 13 December

    […] March 2013 Tom Sherrington wrote what I think of as one of his most iconic blogs, The Data Delusion.  He concluded, “On average it is a bit more complicated than that.”  Having just re-read […]

  • Assessment: stay focussed. – Thinking aloud.
    Posted at 19:17h, 19 October

    […] assessment is complicated. As Tom Sherrington (@headguruteacher) sets out here, national assessment criteria are based on performances for a national cohort and are […]

  • More problems with Progress 8 | Roger Titcombe's Learning Matters
    Posted at 16:28h, 27 October

    […] – that is derived from the raw scores on two tests in different subjects.  If you read my posts The Data Delusion or The Assessment Uncertainty Principle, you will see how far we move away from understanding […]

  • Helen
    Posted at 09:06h, 23 November

    Fascinating and troubling…

    My question is about how we lead in this context. How do you message this to your SLT and staff more widely? How do you reconcile the frustration and doubt with the need to “get on with it” and work within the deeply flawed system? That’s the part I’m stuck on… you can dispense with lesson gradings easily enough, but what about the stuff “higher up the chain”? We don’t have a choice to opt out.

  • CURMUDGUCATION | Mister Journalism: "Reading, Sharing, Discussing, Learning"
    Posted at 20:27h, 27 November

    […] The Data Delusion […]