1. Introduction – “the most assessed pupils in the world”
“There is no such thing as a fair test. The situation is too complex and the notion too simplistic.” - Gipps & Murphy (1994: 273).
There exists a wide range of types and methods of assessment; however, the title of this paper refers predominantly to summative assessment, that is, assessment which takes place at the end of a teaching period or unit of work and which is designed to show what each pupil has learnt.
When it comes to summative assessment, England has a great number of standard assessments. Foundation Stage teachers complete a Foundation Stage Profile which aims to look at the whole child; we have end of Key Stage 1 SATs for Reading, Writing and Maths; end of Key Stages 2 and 3 SATs for English, Maths and Science; and finally students take a range of GCSEs at the end of Key Stage 4, before possibly choosing to study for further optional qualifications such as A levels.
In addition to this, most schools choose to carry out their own summative assessments at other stages. For example, at Moreland Primary School in Islington, the school in which this writer teaches, we take part in the Optional SATs for English and Maths at the end of Years 3, 4 and 5, and we conduct our own summative assessments for these subjects halfway through each year. By the end of Key Stage 2 at Moreland, a pupil will have taken part in approximately ten weeks of English and Maths tests – nearly an entire term. Such practice is common among schools, and therefore surely supports the statement that English school children are amongst the most assessed in the world.
In this paper, the writer will argue that although there is some necessity for summative assessment in a pupil’s school career, our country’s emphasis on improving summative assessment results has led to a culture of targets, a focus on league tables, a limited curriculum and a limited idea of what it is important to know and be able to do, an emphasis on target groups of pupils which often excludes those above and below, and the over-use of setting, streaming and ability grouping. All of this, the writer will argue, has limited teaching and learning styles and methodology, and has greatly affected pupil motivation and self-esteem.
In addition, the writer will further argue that such a focus on summative assessment, and the culture that it creates, has the greatest impact upon pupils from ethnic minority backgrounds, particularly those with English as an additional language (EAL), and will outline the reasons as to why this is the case. Pupils in England from different ethnic minority backgrounds receive startlingly different average assessment results; some groups attain much higher than the overall England average, while other groups (the most dominant ethnic minority groups in fact) attain far below. As well as discussing the issues in the above paragraph as causes, this paper suggests flaws in the assessment system itself which could be changed in order that the differences in results between ethnic groups are minimised. Alongside this, the writer will look at the relationship between ethnic group, gender and class, and the ways in which she believes that the education system in this country is set up to favour certain categories of pupils above others, arguably resulting in unintentional institutional racism.
Through this discussion, as she is a primary school teacher, the writer will focus upon Key Stage 1 and 2 SATs and the Foundation Stage Profile, but will give attention to other assessments where it is important to do so.
2. Different types of assessment
Although this paper will focus on a discussion of the uses and misuses of summative assessment, it is useful to remind ourselves of other forms of assessment as well. In addition to summative assessment, as described above, there exist diagnostic assessment, formative assessment and self/peer assessment. Diagnostic assessment involves assessing, through observation and interaction, what the pupils know prior to teaching and what they therefore need to be taught; formative assessment involves assessing (commonly through marking or through working with pupils) what pupils have understood so far and therefore how they need to move on; and self/peer assessment involves supporting pupils to assess their own or their classmates’ understanding of their learning and how they can move on. All of these types of assessment, as opposed to summative assessment, form part of the daily work of teachers.
Black and Wiliam (1998: 2) argue that it is formative assessment that is the most important in raising the achievement of learners as it is “actually used to adapt the teaching work to meet the needs” and is “at the heart of effective teaching”. They go on to argue that “improved formative assessment helps the (so-called) low attainers more than the rest, and so reduces the spread of attainment whilst also raising it overall” (1998: 4). This writer would suggest that, with the existence of frameworks and schemes of work that try to cover too much and encourage teachers continually to plan too far in advance, teachers are losing the skill (or possibly the courage) to use their own formative assessment in their planning. This writer would further argue that a greater focus on improving the skill of formative assessment amongst teachers would be much more beneficial in raising standards than such an emphasis on summative assessment, and critically, that as EAL pupils in particular are more likely to be among the lower attainers in a school, a focus on this area would benefit such pupils the most. It is highly unfortunate therefore that so much emphasis is placed by so many upon summative assessment, and that it is through this that a school is judged.
3. Summative assessment results of ethnic minority pupils
Overall, ethnic minority pupils attain below the national average, as discussed by Smith in our course session on 13/5/08. The data for Islington showed a wide gap in end of Key Stage 2 English three-year rolling average results, with 78% of white British pupils attaining L4+, while only 69% of ethnic minority pupils attained this. The results for other subjects also showed large differences.
However, as the Equality and Human Rights Commission warned, there are “dangers of viewing ethnic minority pupils as a homogenous group”, pointing out that there was a “30% point difference in Key Stage 2 Maths between Chinese at the top end … and Black Caribbeans at the bottom” (TES, 30/5/08). This corresponds with Gipps and Murphy’s assertion that data analysis should be sophisticated enough that it does not hide the continuous underachievement among certain ethnic categories of pupils, pointing out that African, Asian and Indian pupils attain well above the London average, Pakistani and South East Asian pupils also attain above, but Bangladeshi, Turkish and Caribbean pupils attain significantly below (1994: 233).
When Smith’s data was broken down into ethnic categories, it showed that Bangladeshi and Black Caribbean pupils attain almost as highly as white British pupils with 73% and 72% attaining L4+ respectively, but black African pupils attain below at only 68%, and Turkish pupils attain well below at 61%, highlighting a genuine problem amongst certain ethnic groups of pupils in this borough.
Likewise, in Moreland, our data analysis by ethnicity of the Key Stage 1 and 2 SAT results for the year 2007-08 showed that Somali pupils attained above the school average, while Bangladeshi, Black African and Black Caribbean pupils attained below, and Turkish pupils attained well below.
As Gipps and Murphy (1994: 263) emphasise, a large amount of research has highlighted that there are no biological explanations as to the differences in performance between ethnic groups, therefore we must identify the true explanations for the differences. These are likely to be part societal, in which the education system itself can have only limited impact, but also largely connected to assessment and the overall education system itself, and it is the latter on which this writer will now seek to focus.
4. The uses of summative assessment
Summative assessments such as SATs and Foundation Stage Profiles are arguably very useful as a standardised measure to ensure that all schools – which are of course accountable to parents, pupils, local authorities, Ofsted, the government and ultimately the taxpayer – are performing highly and achieving targets. They can also lead to overall improvements in basic skills such as reading, writing and arithmetic. As well as this, as opposed to teacher assessments, the results of standardised tests are not dependent on individual teachers or groups of teachers who may be susceptible to stereotyping certain categories of pupils, or who are intentionally or unintentionally racist.
In addition to this, the results from summative assessments can prove very useful in school planning and organisation. School leaders and teachers are able to identify easily pupils and groups of pupils who are underachieving and therefore can target resources towards them, as well as being able to identify gifted and talented pupils and those with learning disabilities, ensuring that appropriate support is provided to all of these groups of pupils in turn. Arguably of course, schools can do all of this without summative assessment results through teachers’ knowledge of their pupils, but having the evidence to justify actions and use of resources proves useful in the current climate of performance management and accountability.
Furthermore most secondary schools, and an increasing number of primary schools including Moreland at Year 6, use summative assessment results to place pupils into sets or streams, and within classes pupils can be placed into ability groups for core subjects, as happens in all year groups at Moreland. This way of teaching is seen as being much more manageable from the teacher and school’s perspective, and supports schools in being able to reach targets set for them by the local authority by making it easier to focus upon particular groups.
As well as this, schools and local authorities also study the results of summative assessments by ethnicity. At Moreland, we use this data in our planning of target groups of pupils, in identifying the specific needs of different ethnic minority groups, for policy development, and to inform our allocation of resources. For 2007-08, for example, our data highlighted that Turkish pupils were continuing to underachieve; we therefore increased our Turkish Teaching Assistant’s hours to full time, took part in a Turkish Maths Project and the Turkish GCSE, and focused upon supporting Turkish parents, and this writer’s time as EMA Coordinator was concentrated upon Turkish pupils. All of this supports our school in our efforts to raise the attainment of certain groups of pupils, and thereby helps us to meet our obligations on race equality.
All of these uses of summative assessment results are now standard practice in schools and can be very beneficial both to schools and their pupils. However, the pressure upon schools to meet targets year upon year is great, and this has resulted in just as many misuses, as well as uses, of assessment.
5. The misuses of summative assessment
Over the past decade, the increasing importance placed upon improving summative assessment results, along with the ongoing focus placed by much of the media and some groups of parents upon league tables, has led to many changes in the way in which schools work.
While secondary schools have made use of setting and streaming for a long time, it is only recently that primary schools are making increasing use of it, in addition to an established practice of ability grouping within classes, as it makes reaching the targets set for summative assessment results easier for schools and teachers to manage. However, Gillborn (2005) is against setting, streaming and ability grouping, saying that it reinforces the disadvantages of pupils from ethnic minority backgrounds. He argues that “because lower-ranked groups cover less of the curriculum, they have a cumulative effect that can be devastating”, i.e. that if pupils are put into lower groups at a young age in which less is taught and expectations are lower, then it is unsurprising that there are markedly different results, with pupils never being able to ‘catch up’ to a higher group and therefore suppressing potential to achieve highly. To add to this, this writer notes that many of the lower-attaining pupils at Moreland are frequently taken from other subjects in order to attend additional literacy and/or maths sessions, resulting in their not receiving the learning they are entitled to in these other subjects and therefore limiting their potential in these subjects throughout their entire school career as well. Gillborn further states that black pupils are frequently over-represented in lower groups and sets, and finds that they are very under-represented in gifted and talented schemes, and argues that this depression of black pupils is institutional racism. As Gipps and Murphy argue, “such grouping, though it might have pedagogical justification, limits actual equality of opportunity and is therefore inequitable” (1994: 239-40).
Following on from this, another problem with ability grouping and the focus on target groups is that more able pupils and pupils with learning disabilities tend not to have their needs met appropriately, as the majority of a school’s resources go towards the focus groups whose attainment will support them in reaching their targets.
Another problem is that, as well as whole-school targets, many local authorities set targets and grade predictions for the attainment of different ethnic groups, particularly for GCSE, often using Fischer Family Trust data. However, as argued in the TES (27/5/08), having different targets and grade predictions for pupils from different ethnic groups means we are stating from the start that we have higher expectations for some ethnic groups than for others. The article states that “the targets could distract from pupils’ personal abilities and result in low expectations” of certain ethnic groups, which would again be institutional racism.
A further consequence of the focus on summative assessment results is a narrowing of the curriculum that is taught in schools. Primary schools have become much less practical and creative, and much more focused upon basic skills, and this writer suggests that there is therefore much untapped potential among pupils in a whole range of areas such as art, music and sport. Further, it can be suggested that this potential is likely to remain untapped due to certain pupils’ loss of interest in learning due to the limited curriculum to which they are exposed. As Burgess-Macey states, the SATs become an “alternative, narrow curriculum … produces sterile learning situations and depresses real levels of achievement” (1994, in Keel, 1994: 57). As well as this, the focus on basic skills has led to the creation of the literacy and numeracy frameworks in primary schools, which have arguably limited the variety of teaching and learning styles being used, with new generations of teachers being trained only in how to deliver these frameworks, whether or not they are suitable for all pupils at all times.
Following on from this, there is also a concern that there is an increasing culture in our classrooms of ‘finding the right answer’ and gaining rewards. As Black and Wiliam (1998: 8-9) argue, pupils look for ways to obtain the best marks and are reluctant to ask questions out of fear of failure; if they have difficulties or poor assessment results, they avoid investing further effort in learning which could lead to further disappointment, and this enhances the extent of underachievement and widens the gap between the higher and lower attainers.
All of the above have an impact upon pupil motivation and self-esteem. If a pupil is not having their needs met, whether they are able or have a learning disability; or if a pupil has spent their entire school career in a low ability group and therefore has never been provided with the educational opportunity to attain higher and has stopped investing effort in learning; or if the curriculum and teaching styles are so limited that they do not interest pupils or recognise their skills; or if ongoing low summative assessment results give their school a reputation as being ‘bad’ (as is the case in many urban, challenging schools – those which ethnic minority pupils are most likely to attend), then motivation and self-esteem will of course be low. This writer would go on to comment that such action by the education system is only a few steps away from explaining rising mental health problems among young people, which are particularly prevalent among black young people, the rise in anti-social behaviour and youth violence, and the problem of so many young adults not being in education, employment or training due to a loss of willingness to try.
6. The summative assessment system
As Gipps and Murphy (1994: 228) have observed, researchers tend to look at environmental/societal factors to explain differential group performance, such as social class and school attended, rather than factors within the assessment system itself. As part of their research, they discuss the development of the SATs in the early 1990s and, interestingly, state that in the earlier stages of development of the Key Stage 1 SATs the “development teams took seriously the issue of assessing young bilingual children”, having trialled the delivery of bilingual assessments and the use of bilingual support staff where available, made attempts at removing cultural bias in the tests, and carried out much analysis of the results and surveys completed by teachers. These trials resulted in bilingual pupils performing better in the SATs than in teacher assessments, arguably due at least partly to the elimination of racism, although still attaining lower results than the overall average (1994: 187-188).
However, the trials highlighted to the test developers many problems that the SATs would have in assessing different groups of pupils. Indeed, when discussing the results of teacher surveys about the early SATs, Gipps and Murphy note an NFER study showing that only a third of teachers believed that SATs were satisfactory for assessing EAL pupils (1991b, in Gipps & Murphy, 1994: 195). Difficulties included EAL pupils being more likely to misunderstand instructions, and being more confident when instructions had been rephrased in their first language; the assessment of EAL pupils in the earlier stages of English acquisition; that assessors may interpret criteria in different ways for different pupils; and that there are some questions and contexts which are less meaningful to some groups of pupils. These findings suggest that a range of modes and task styles should be used, as well as an expansion of the range of indicators or attainment targets, in order that pupils who are disadvantaged in one test or by one set of indicators may have an alternative opportunity to provide evidence of their learning. While the researchers admit that it is impossible to completely decontextualise assessments, and that their work actually points to the need for a great many more tests than we have currently, they sense that this is what must happen in order to provide equal access and opportunity to all of our pupils (1994: 188-194). Sadly however, when Gipps and Murphy go on to discuss the later development of SATs, they strongly note that, despite all of the findings of the earlier trials, concerns about the assessment of EAL pupils were actually “less evident as time went on” (1994: 187-194), and that this was due to reasons of manageability and the popular idea that having just one standard assessment is the fairest way of testing.
The issue of context in tests, this writer would assert, is not, however, just a concern for EAL pupils, but also one for different ethnic minority groups whose first language is English, and also for different genders and pupils of different socio-economic backgrounds, or class. The point is well-made by Lyseight-Jones who comments that “if test designers presume that different populations are similar, or expect that different populations should operate to the conventions of only one societal group, then the designer will also expect that the results of the tests which they design should be treated as valid, even when they cannot be so” (1994, in Keel, 1994: 23). This writer would emphasise the point by asking whether it would be fair to provide an English SAT test to an English-speaking pupil in Jamaica of the same age, and strongly argues that it would not be, largely for contextual reasons. If it is agreed that this is the case, then it must be agreed that having the same very limited number of contexts in tests for all pupils is not only invalid but is arguably institutional racism towards certain groups of pupils, as the assessment system itself is setting up some groups of pupils to attain more highly than others.
To add to this, when discussing attainment targets, Sturman and Francis comment that they are simply “fallible statements made up by a group of people … they do not necessarily reflect the ways individual children learn nor the sequence in which they learn” (1994, in Keel, 1994: 70). This writer agrees, recalling instances with many pupils, but with EAL pupils in English in particular, when they have been able to carry out several aspects of a higher level’s attainment target while not yet being able to carry out aspects of lower levels, resulting in their given level not actually being a true record of what they are able to do.
This fallibility of statements is, in this writer’s opinion, particularly a concern in the Foundation Stage Profile statements. As Burgess-Macey notes, a very small number of behaviours and learning have been selected and promoted as significant above all others (1994, in Keel, 1994: 59), and furthermore that the statements show cultural bias. It is pointed out that some early learning is common to different groups but some is different; for example, “dresses and undresses independently and manages own personal hygiene” (PSED1 #4) is arguably culturally biased as not all cultures see this as something that young children should do. Burgess-Macey also highlights a concern that, as young pupils can be racist towards each other, this can lead to aggressive behaviour in some young black pupils, which is then recorded as poor behaviour (1994, in Keel, 1994: 54), and this writer imagines that they would therefore score low levels in PSED2 in particular.
Moreover, there are many statements under CLL and PSED that EAL pupils may be able to do in their first language but are unable to do in English, such as “uses talk to organise, sequence and clarify thinking” (CLL4 #7) and “communicates freely about home and community” (PSED3 #2). This results in these pupils attaining a lower level than their peers despite being bilingual and therefore having actually acquired more linguistic skills and knowledge.
Following on from this, although many Foundation Stage pupils in settings such as Moreland Children’s Centre are still in the very early stages of English acquisition, there exists nothing in the assessment system to highlight the skills that they have acquired in their first language, nor the cultural understandings and knowledge that comes with it, highlighting therefore that – despite the rhetoric – the education system does not truly value the linguistic and cultural diversity of the pupils it is educating. In theory, the Profile Book should be a record of the whole child, including all of the additional learning that they have achieved. Yet due to the range of languages and limited resources, as well as minimal understanding of pupils’ home lives among teachers, this is usually not the case. Furthermore, the increasing emphasis placed upon the Foundation Stage Profile statements, as opposed to the Profile Books, means that the whole child is no longer the focus; having a checklist limits what teachers choose to observe and clouds everybody’s judgments about what it is important to be able to do. As is already the case with older pupils, our youngest children are therefore already being required to fit into a chronological series of learning, limiting teachers’ and parents’ ideas of their pupils’ abilities before they have even begun statutory schooling. As Burgess-Macey states, a checklist such as this serves the interests only of people who want an easy method of comparing narrow achievement on entry with narrow achievement at age seven, and comparing school to school (1994, in Keel, 1994: 56).
Making changes to the assessment system itself would be a vital component in ensuring, as much as possible, that there is equality of access for all ethnic groups and that unintentional institutional racism is eliminated. This writer would suggest that a minimum change in order to achieve this goal would be an increase in the number of reading and writing tests in the English SATs, with different contexts and text-types used, and that these could be carried out over a longer period of time than just an exam week, such as over half a term. Currently, judgments are made about our pupils by looking at a very limited range of their learning and in a very short space of intensive time, which we at Moreland believe, along with teachers at a great many other schools, does not provide the greatest opportunity for our pupils to show what they know and can do.
In addition, this writer agrees with Gipps and Murphy (1994: 275) that names on tests should be replaced by numbers in order that there can be no bias among the markers, and that more effort should be made by all schools to ensure that EAL pupils are provided with oral bilingual translations of instructions in tests. Finally, this writer believes that there should be a greater focus placed upon Foundation Stage Profile Books, as opposed to the Profile statements, which can look at the development of the whole child, changing the emphasis from showing what pupils don’t know and can’t do, to showing what they do know and can do.
As Lyseight-Jones (1994, in Keel, 1994: 33) states, “what we need to do is to remain sceptical about the validity of the broad National Curriculum assessment outcomes, while at the same time speculating ‘what if it’s all true?’ and act as if it might be”. In remaining sceptical, this paper has argued that there are many areas of the education system, and the assessment system in particular, that need to change in order to minimise the differences in access as well as outcome that this writer believes currently exist within our system of summative assessment. It is her overall belief in fact that a system similar to the Foundation Stage Profile Books (or possibly the 14-19 Diploma or International Baccalaureate) would be beneficial for pupils in the five to fourteen age range, and would particularly support ethnic minority pupils. These would look at the whole child rather than solely having a narrow focus on three subjects, and would therefore have an emphasis on what the pupil can do rather than on what they cannot do, including skills and knowledge of other languages and cultures. They would include pieces of work by the child over a range of time and from a broad range of subject areas, contexts and text-types, acquired through a combination of everyday classroom work and a range of standardised tests taken over a long period of time. In order to maintain high standards and accountability, these records of achievement could of course be subject to moderation and inspection, particularly as this would continue to ensure that prejudice from individual teachers cannot have an impact upon results.
In addition, this writer recommends that setting and ability grouping should be kept to a minimum, with the norm for every subject being that every child has access to the entire curriculum and has the opportunity to demonstrate high levels of learning; school organisation should always be with a focus on what is best for the pupils, not on what is easiest to manage. As well as this, she would place a greater emphasis on formative assessment, eliminate the setting of differential targets for different ethnic groups, increase the range of teaching and learning methodologies being used, and would certainly eliminate league tables. All of this, the writer argues, would increase pupil motivation and self-esteem, lessen the gap between the higher and lower attainers, raise the attainment of ethnic minority pupils, and minimise unintentional institutional racism.
This writer will summarise her thoughts with a quote from Burgess-Macey (1994, in Keel, 1994: 60): “children are labelled according to what is written down and not on the basis of what we really know about them … if what is written misrepresents children’s achievements … we have failed our children”.
References
Black, P. and Wiliam, D. (1998) Inside The Black Box. London: nferNelson.
Burgess-Macey, C. (1994) Assessing young children’s learning. In Keel, P. (1994) Assessment in the Multi-Ethnic Classroom. Chester: Trentham Books.
Gillborn, D. (2005) Race Inequality, “Gifted and Talented” Students and the Increased Use of “Setting by Ability”. London: University of London.
Gipps, C. and Murphy, P. (1994) A Fair Test. Buckingham: Open University Press.
Lyseight-Jones, P. (1994) An inexact science – issues of assessment. In Keel, P. (1994) Assessment in the Multi-Ethnic Classroom. Chester: Trentham Books.
National Foundation for Educational Research. (1991b) An Evaluation of the 1991 Curriculum Assessment Report 3: Further Evidence on the SAT: Manageability and Relationships with Teacher Assessment. In Gipps, C. and Murphy, P. (1994) A Fair Test. Buckingham: Open University Press.
Smith, G. (13/5/08) EMA Accredited Course: Ethnic Monitoring & Data Analysis.
Sturman, E. and Francis, M. (1994) Assessing primary progress. In Keel, P. (1994) Assessment in the Multi-Ethnic Classroom. Chester: Trentham Books.
Times Educational Supplement. (27/5/08) Grade predictions lottery.
Times Educational Supplement. (30/5/08) Gender/Race/Class – Mind the performance gaps.