11 Oct KS3 Assessment. 8 steps towards a workable system.
We are about to launch our KS3 assessment system. I’ve shared the full details in this post. We’ve arrived at this model after considering all the following issues/questions/factors:
1. Accept the reasons that NC Levels became a broken system.
This has been covered by lots of people in great detail, but here’s a quick summary:
- Prose level descriptors were virtually impossible to apply consistently or meaningfully.
- Levels originated to define broad range attainment standards at the end of a Key Stage; they never worked as a ladder of progress, certainly not in sub-level form. They also did not hold water in relation to specific elements of content – bits of science or maths for example. Content X = Level Y is a nonsense because all knowledge has varying degrees of depth and complexity to which it can be learned. It pains me that in my school students had ‘6b’ written on a piece of work in some subjects. As if that could be known.
- The illusion of progress from 6c to 6b to 6a etc was divorced from assessment processes robust enough to measure to that degree of precision; a lot of that was driven by data tracking systems and teachers putting their finger in the air. Variation within and between subjects was huge – no amount of moderation could safely line up a 6b in Geography, French and Maths within schools or between schools.
- A very high proportion of discourse with parents and students and between teachers became focused on the numbers rather than the learning.
And no, something is not necessarily better than nothing.
2. Think about the way teachers in each subject evaluate standards and describe how to improve.
Every subject has distinctive features. To improve a piece of writing in English or History, you need to focus on some specific aspects in the context of the particular writing task alongside some general features that apply to all writing. It’s too complicated to generalise into a neat list; the feedback is too specific to each student. Grading the work can’t possibly communicate the areas for improvement. Here, the assessment in the micro is the key and students can only focus on a few things to practise and improve at any one time. General descriptors of writing only make sense in relation to specific examples. The same issues hold in Art.
In Maths and Science, you cover lots of bits of content and some repeated skills each of which can only be measured in relation to specific questions. Tests with a certain range of questions are the most commonsense way to gauge progress. Of course, some students express ideas verbally too and that will also inform teachers’ judgements. A score of 80% tells you something; a score of 40% tells you something – if you know the test. However, it’s the specifics of which questions were right/wrong that contain the learning.
So – it’s complicated. It’s all in the detail. Any KS3 assessment system needs to ensure maximum focus remains on the detail. Aggregating things up to a simple measure is always going to be flawed.
3. Balance the micro and the macro: keep in mind the need for a system that actually helps students to learn but is also manageable in scale.
I’ve seen some systems being developed that have become heavily bureaucratic with lots of data entry required against prose descriptors and ‘can do’ statements. It’s tempting to think we could track progress along the axis of ‘solving algebraic expressions’ or ‘writing to persuade’ in a helpful and meaningful way. However, once you add up all the different axes required, you’re talking about a massive system with thousands of data points. I’d suggest there is a limit that is quickly reached before this is unsustainable. (I remember being presented with 17 different tick-sheets for every student in my class in the early 90s – when Science had 17 attainment targets in the National Curriculum. I left them all blank, knowing the system would die. It died in weeks.)
I think it all needs to be much more organic than that; lived, not tracked. Our Assignments are meant to be an organic tracking system to be annotated by students and teachers in books. We want the micro learning to be prominent but not cumbersome. I wouldn’t dream of putting all the assignments online – no-one has time to track at that level of granularity.
4. Don’t do ‘can do’ statements.
I know this is a popular path that different schools have gone down but I think it’s a mistake. As Daisy Christodoulou and others have shown, ‘I can do X’ only makes sense in relation to a specific set of questions.
For example, ‘I can explain why my heart beats faster during exercise’ or ‘I can use the past tense’ are statements that can never be ticked off securely. They depend on the degree of depth, the level of complexity, the context, the extent to which they are sustained. Even if the ticking off is staged (approaching, securing, mastering etc) it’s the same problem.
Not only are ‘can do’ statements impossible to judge as complete, they also fall foul of Issue 3 – it’s going to be a massively cumbersome system. You may as well give out copies of the GCSE specification and tick bits off.
5. Embrace the value of formative tests.
I think we all need to re-write the popular slogan: weighing the pig doesn’t fatten it. As we should now know, weighing the pig does fatten it if we are talking about testing within a learning process, not just at the end. (Once again, this re-write is borrowed from Daisy C and derives from work by Robert Bjork and others. Testing fuels learning – it’s a fact.) Scores from tests that focus on specific elements of learning are a very efficient and effective way to determine the depth of learning and to gauge progress. There is no value in then turning each test into some pseudo-scale (35/60 is a 5c?). The point of a test is that it tells you where learning and teaching are stronger and weaker, student by student and at whole-class level. Tests are about the micro, not the macro.
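As a rough illustration of reading a test at the micro level, here is a minimal Python sketch. Everything in it is hypothetical: the student names, question labels and marks are invented, and the helper names `missed_questions` and `class_success_rates` are not part of any real system. It reports which questions each student missed and how the whole class fared on each question, without converting anything into a grade.

```python
from typing import Dict, List

# Hypothetical data, invented for illustration only.
# Each list holds one mark per question: 1 = correct, 0 = incorrect.
QUESTIONS: List[str] = ["Q1 fractions", "Q2 ratio", "Q3 algebra", "Q4 graphs"]
RESULTS: Dict[str, List[int]] = {
    "Student A": [1, 1, 0, 1],
    "Student B": [1, 0, 0, 1],
    "Student C": [1, 1, 0, 0],
}


def missed_questions(marks: List[int]) -> List[str]:
    """Return the labels of the questions a student got wrong."""
    return [q for q, mark in zip(QUESTIONS, marks) if mark == 0]


def class_success_rates(results: Dict[str, List[int]]) -> Dict[str, float]:
    """Return the fraction of the class that answered each question correctly."""
    n_students = len(results)
    return {
        q: sum(marks[i] for marks in results.values()) / n_students
        for i, q in enumerate(QUESTIONS)
    }


if __name__ == "__main__":
    # Student-by-student view: raw score plus the specific questions missed.
    for student, marks in RESULTS.items():
        print(f"{student}: {sum(marks)}/{len(marks)}, missed {missed_questions(marks)}")
    # Whole-class view: a low rate points at something to re-teach.
    for question, rate in class_success_rates(RESULTS).items():
        print(f"{question}: {rate:.0%} correct")
```

In this made-up class, every student missed the algebra question, which says more about what to teach next than any of the three total scores would.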
6. Accept that benchmarking is also needed.
There is a strong desire among all concerned to benchmark: how good is good? In addition to the micro detail of what has been learned successfully, we need a simple indicator to link that to broader-brush standards. I suppose this is what NC levels were intended for before they became corrupted. For us, this is the last piece of our system to be introduced. There are lots of issues to wrestle with:
- Are you re-creating the flaws of levels with numbers dominating the discourse and false measures?
- Are you reinforcing fixed mindset thinking by locking people into pathways?
- Are you making false connections from the messiness of the micro of real learning to the neatness of a macro data point?
- Are you risking introducing an institution-specific scale that can’t be moderated or linked to national standards?
- Are you using bell-curve markers (such as GCSE grades) as a ladder? This is always misguided.
We think our system can work. We’re looking at starting points, focusing on progress and pegging it to GCSE grades. This is risky because no-one actually knows what the 1-9 scale looks like in practice; not yet. However, if we refer to it as a rough scale and keep stressing the approximate benchmarking, it will help to contextualise the details. If you know you’re working at grade 7 in broad terms, it puts the teacher feedback into context. But it’s not a ladder – that’s a vital difference. Progress must be defined within the terms of the details of the subject.
It may be harder to sell the idea that each number is a range, not a point on a scale – but that’s what we’re saying.
7. Remember the assessment uncertainty principle.
The Assessment Uncertainty Principle is one of my favourite posts. We must not allow the illusion of fine-tuned assessment to be created by sub-steps and fine-grading. Assessment is fuzzy and anything we do that suggests otherwise needs to be recognised and handled with care. Is a student on Grade 7 in our system necessarily achieving at a higher level than someone awarded Grade 6? Well, no. We’re simply projecting teachers’ judgements and offering a best guess based on our estimates to give a rough idea. Within the detail, a test score of 63% in Maths is going to be more meaningful, but actually, even then, only when we look at which questions the student got wrong and why.
8. Learn by doing
Personally, I reject the complaint that we should have continued with levels until a better system was devised. I also reject the idea that schools should have been handed a National Assessment System on a plate by the DfE. This process has been difficult but also invigorating and necessary. For the first time IN DECADES, teachers have had to think for themselves about what assessment should look like. This is Michael Gove’s greatest gift – even if he did it rather by accident. Nature abhors a vacuum – and so do teachers! The sharing and debate around this has been a highlight of the last couple of years.
I doubt very much that the system we’re using will be the same in three years’ time. But how will we know if it works unless we give it a go? We’ve put a lot of thought into it, rejected other models for good reasons, and ended up with something with a good chance of succeeding; something logical, manageable and well-reasoned. I’m happy with that. Let’s see what happens!