Assessment reform: students' and teachers' responses to the introduction of stretch and challenge at A-level
Abstract
This paper describes an exploration into a reform of the A-level qualification in England in 2008; namely, the introduction of the ‘stretch and challenge' policy. This policy was initiated by the exams regulator and determined that exam papers should be redesigned to encourage the application of higher order thinking skills, both in the classroom and in examinations. The present study comprised two strands that explored the perceptions of students (n = 39) and teachers (n = 27) regarding the degree to which the incorporation of opportunities for stretch and challenge in the new examination papers had been achieved, and the likely effects on teaching, learning and exam preparation. On the whole, students and teachers welcomed the stretch and challenge policy and there were some indications that changes to the design of question papers could have some positive backwash effects.
Introduction
A-levels are the most commonly used qualifications for university entrance in the UK, with the examinations usually taken at age 18 by approximately half of the age cohort. In recent years, some UK education commentators have argued that A-levels do not stretch the most able students. Indeed, Bassett, Cawston, Thraves, and Truss (2009) recently called them ‘ersatz examinations', suggesting that they did not assess depth of knowledge and understanding. Further concerns have been raised that students' responses to A-level exams have become formulaic and reliant on rote learning (e.g. Alton 2008) and that there is a lack of discrimination in A-level grading, with ensuing problems for selection to higher education. The former Labour government introduced a number of policies seeking to tackle these issues in the Education and Skills White Paper (Department for Education and Skills [DfES] 2005). This included the introduction of an A∗ grade at A-level, with the aim of dealing with the issue of differentiation – the A∗ grade would reward outstanding performance and allow for improved discrimination between the best performing candidates. In addition, the Extended Project was introduced to add depth and breadth to the curriculum, stretching advanced students and testing a wider range of skills. A major element of the White Paper was the ‘stretch and challenge' policy, which included the introduction of new styles of examination questions which could stretch and challenge students in exams and in the classroom.
These new A-levels were introduced for first teaching in September 2008 and first certification in June 2009. It should be noted that Advanced Extension Awards (AEA) were introduced at A-level in 2002 to better differentiate between higher-level students and to give them the opportunity to demonstrate fully their knowledge, understanding and skills. The AEA was withdrawn after June 2009, partly due to low demand (Baird and Lee-Kelley 2009) but also to avoid duplication of provision following the introduction of the stretch and challenge policy in the DfES (2005) White Paper.
Ruth Kelly, the then Secretary of State for Education and Skills, wrote that her primary purpose in introducing the stretch and challenge policy was to: ‘encourage our brightest A-level students to develop broader skills and knowledge than those currently required by A level' (Kelly 2006, 1). As the maintenance of A-level standards over time was expected, the stretch and challenge policy could not be allowed to impact upon the grading of the examinations. Rather, the policy was intended to increase the experience of difficulty for the students – stretching them, but not necessarily rewarding them in a different way. Although the UK has since undergone a change in government there is no suggestion, as yet, that the stretch and challenge policy will be rescinded. On the contrary, the recent White Paper (Department for Education 2010) specifically states a desire to explore how A-levels can provide deep synoptic learning while actively comparing the performance of UK pupils against international benchmarks.
Manifesting stretch and challenge
Guidance to the awarding bodies set out several ways in which stretch and challenge could be manifested in the new question papers:

- Use a variety of stems in questions, such as analyse, evaluate, discuss or compare, to avoid a formulaic approach and to elicit a full range of response types.
- Avoid assessments being too atomistic. The requirement here was to encourage more synoptic responses that could draw upon students' learning in other units or subjects.
- Include some requirement for extended writing, except where inappropriate (e.g. mathematics). Extended writing demands more of the student in terms of structuring their knowledge and drawing upon cognitive resources to create their response.
- Use a wider range of question types to address different skills (i.e. not only short-answer/structured questions).
- Improve synoptic assessment by asking students to bring knowledge and other prescribed skills to bear in answering questions, rather than simply demonstrating a range of content coverage.
The guidance also outlined the features which examiners could expect to find in the responses of high-performing students: namely that high-performing students, or students with ‘expert' knowledge, should be able to appreciate the underlying nature of a task and not be distracted by a task's surface-level features. Accordingly, high-performing students should be able to respond to novel situations that require the application of skills and knowledge that they have developed in other contexts (i.e. with synoptic writing and use of applied stems). The intention was that the new assessments would provide greater opportunity both to challenge high-performing students and for students to demonstrate their ability across a range of question types.
Backwash effects
For the stretch and challenge policy to have its intended effects in the classroom, the new assessment styles would have to encourage new approaches to teaching and test preparation – a process known in the literature as washback, backwash (Alderson and Wall 1993; Bailey 1996; Cheng and Curtis 2004) or measurement-driven instruction (Popham 1987). Throughout the rest of this paper, we use the term ‘backwash', which ‘occurs when a high-stakes test of educational achievement … influences the instructional program that prepares students for the test' (Popham 1993, cited in Chapman and Snyder 2000, 460).
In their reviews of the literature on language testing, Alderson and Wall (1993) and Bailey (1996) concluded that there was relatively little unequivocal empirical evidence to determine the exact nature of backwash, its mechanisms, what constituted positive or negative backwash, and how to foster the former and limit the latter. However, there are a number of factors that may influence backwash. Foremost is the potential for teachers and teaching practices to mediate the relationship between the test and students' learning. Stecher, Chun, and Barron (2004) conducted a survey to determine the effects of a programme of assessment-driven reform on the teaching of writing. Teachers reported making changes to the allocation of time, their emphasis upon different aspects of the subject content and their teaching practices. However, teachers also reported spending more time on subjects that were tested than on non-tested subjects. Furthermore, the core of writing instruction was unchanged from before the introduction of the new tests. Therefore, while the new tests and standards appeared to influence teaching practices, they also had the effect of narrowing that teaching to cover only aspects of the subject that were to be tested.
High-stakes public assessments have many consequences for the test-takers and those delivering the curriculum. The primary task of educationalists is to ensure that the consequences are as constructive as possible, making a positive contribution to learning (Stobart 2003). Consequently, it is clear that exploring students' and teachers' perceptions of the stretch and challenge policy implementation could contribute to our understanding of the potential effects of this initiative. As outlined in the next section, the present study comprised separate student and teacher strands (for full details see Baird, Chamberlain et al. 2009; Baird, Daly et al. 2009), which explored both groups' responses to the new policy and the resultant examination papers that aimed to provide stretch and challenge opportunities for the most able students.
Method
Student interviews
As the stretch and challenge policy was aimed at the most able A-level students, we sought to recruit participants who had recent experience of success at A-level. A purposive sampling approach was therefore adopted to recruit first-year undergraduate students from two high-ranking universities in England, which were selected pragmatically based on their locations. The study focused on students on the BSc Biology and BSc Psychology programmes, whose entry had been conditional on achieving a grade A or grade B in either subject. Biology and psychology were selected as they utilise two question types that are dominant at A-level: essays and short-answer questions. These question types offer very different challenges in terms of the cognitive demands they place on students and the kind of exam preparation they require of students. It was therefore considered that the research would give a broad insight into how students respond to both ‘factual' (short-answer) and ‘discursive' (essay) exam questions.
An advertisement was circulated by email by the course administrators of the biology and psychology programmes at the two universities, attracting expressions of interest from 73 students. The aim was a balanced sample comprising 10 high-achieving students from each subject at each university, giving a total of 40 interviews. Table 1 shows that ultimately 39 interviews were conducted, five of which were with male participants.
Procedure
The interviews were conducted by a team of five interviewers. Each interviewer followed a semi-structured interview schedule designed to explore students' responses to the stretch and challenge policy. Of primary interest were the students' perceptions of the new specimen A-level question papers which had been produced by the three main awarding bodies prior to the use of the new papers in ‘live' exams. At the time of the research, the specimen papers were publicly available on the awarding bodies' websites in order to assist teachers and students with their preparations for the first exams in June 2009. During each interview the participants were given two examination papers: the paper they themselves had taken in summer 2008 and the corresponding ‘stretch and challenge' specimen paper. Participants were asked to compare the two papers, identify any differences, and share their thoughts on the kinds of challenges that each paper presented.
Table 1. Number of student interviews, by subject and university.

A-level subject | University A | University B | Totals
---|---|---|---
Biology | 8 | 10 | 18
Psychology | 5 | 16 | 21
Totals | 13 | 26 | 39
Participants were informed that all data would be anonymised prior to analysis and that they could withdraw their participation at any time. With the consent of participants, all interviews were audio-recorded and later transcribed verbatim in conjunction with interviewers' field notes. Each interview lasted approximately one hour.
Teacher focus groups
In addition to the interviews with students, four focus groups were conducted to explore teachers' views of the stretch and challenge policy. The aims were twofold. The first was to determine whether teachers believed that the new specimen papers would indeed offer stretch and challenge to the most able students; the second to explore their perceptions of the potential effects on their teaching practices and students' learning experiences. It was considered that a focus group methodology would encourage the sharing of experiences and beliefs regarding assessment over and above what could be achieved during a one-to-one interview with a researcher. Teachers were recruited from a range of subject areas to encourage comparisons of the effects of stretch and challenge in different subjects. For example, it was anticipated that the stretch and challenge policy may bring about different responses in humanities subjects than in science subjects.
The teacher participants were drawn from two educational centres in the north and two in the south of England, comprising an independent school (11–18-year-olds) and a post-16 Further Education college offering A-level qualifications in each region. The schools and colleges were selected on the basis of their large entries for A-level qualifications and, accordingly, the likelihood that teachers would be aware of the stretch and challenge policy and its intended impact on A-level students. Teachers were recruited by the headteachers, who had granted permission for their participation in the research. The four focus groups ranged in size from 3 to 11 participants, with a total of 27 teachers overall. The smaller focus groups resulted from late drop-outs due to changes in teaching commitments. Table 2 shows that eight participants taught in independent schools and 19 in comprehensive sixth-form colleges.
Procedure
The focus group discussions lasted between 30 and 60 minutes and, with the consent of participants, were audio-recorded for later verbatim transcription. Two researchers attended each focus group; one as facilitator and the other as observer. While the observer took notes summarising the content of the discussions, the facilitator guided the discussions, requesting elaboration or directing the conversation as necessary (Krueger and Casey 2000). A focus group schedule was used to ensure that the discussion captured teachers' responses to the specimen question papers and their reflections on how the policy would impact upon their teaching practices. The stretch and challenge specimen exam papers were drawn from the three major awarding bodies in England and covered a range of A-level subjects such as art and design, geography, information and communication technology and physics. Teachers were asked to share their responses to the question papers from their own subject area or, failing that, from the closest subject area available.
Table 2. Composition of the teacher focus groups.

Focus group | Type of centre | Region | Males | Females | Total
---|---|---|---|---|---
A | Further education college | South East | 4 | 7 | 11
B | Further education college | West Midlands | 4 | 4 | 8
C | Independent secondary school | South East | 3 | 2 | 5
D | Independent secondary school | West Midlands | 2 | 1 | 3
Analysis and findings
Focus group and interview data were analysed separately using thematic analysis (Maxwell 2005), which involved several readings of the transcripts while searching for, and checking the validity of, emerging dominant themes. The selection of themes was driven by a combination of the research questions and the data themselves, with the findings from the two strands then considered collectively. Participants' comments were coded using NVivo software on later readings. Some codes were produced from the notes provided by the interviewers, facilitators or observers, while others were generated in response to the data. The dominance of a theme was defined by its extensiveness – the degree to which similar comments were made by several people – and the strength of feeling for a particular topic. Findings from analyses of student and teacher data are presented together under the study's main aims.
The two research strands created a wealth of data and some lengthy participant narratives. The quotations that are included below have been selected on the basis that they best illustrate the content of the discussions. Care was taken to ensure particular participant perspectives were not excluded and quotations that represent contrasting views are provided where appropriate.
Comparing the old and new A-level exam papers
I think you'd have to work harder. It's kind of thinking for yourself, instead of just memorising notes. So, yeah, pulling all your knowledge together, I'd find that difficult. (Psychology student)
This led to claims that the new papers were therefore less predictable than the old papers, and that this lack of predictability could increase stress and uncertainty prior to and during the exam. As typically high-achieving students, however, some suggested that the applied questions represented a better test of their knowledge and could encourage a more creative approach to the examination.
The student participants also drew attention to the fact that the new specimen papers appeared to contain a greater number of questions, a result of examiners incorporating more short-answer questions (worth only one or two marks), which in some cases replaced longer essay questions (typically worth 25 marks). Students' experiences of anxiety in the exam context have been linked to the number of questions to be answered in the allotted time, with students sometimes feeling panicked and rushing to complete all questions (Chamberlain, Daly, and Spalding 2011). Because the new specimen papers appeared to contain more questions, some students commented that the papers might reduce the likelihood of completing all questions and so increase students' levels of anxiety.
If you look at a question and it's 30 marks and you don't know anything about it, you panic and it's a lot harder. Whereas, say, like a question that's then broken down into four parts, although it could still be asking the same things, the guidance just makes it a bit easier. (Psychology student)
If you get [a question that asks you to] ‘describe' they just let you reel off a load of information. They're really good questions and I can't see any of them [in the new paper]! (laughs) (Biology student)
You'd be more challenged in that it looks like you need to know a broader content. Like, you wouldn't be able to like just revise one [topic] and then hope that it came up. (Psychology student)
I think it would have made the revision more challenging because it's quite specific. You'd need to know quite a lot of detail. (Psychology student)
I think this [new paper] tests a different skill. It doesn't test what you know about biology, it tests how you use your biology. (Biology student)
This [new paper] is a lot more practical than the sort of papers I've sat, maybe a bit more scientific. There's a bit more terminology in there. I think it's a step forward. Maybe harder to revise for. Probably more enjoyable in the grand scheme of things. (Biology student)
The teacher participants also remarked upon an apparent increased reliance on short-answer questions in the new specimen papers. Interestingly, and in contrast to the student participants, the teachers noted the use of many factual recall questions, which they themselves identified as being in conflict with the stretch and challenge policy. Although teachers did not generally perceive the old papers to be too easy, or hence as needing to be made more difficult, many nonetheless welcomed the attempts of the awarding bodies to introduce more stretching elements.
Reflections on the need for stretch and challenge
Because more people can understand the techniques needed to take an exam and more people are passing, everyone's complaining. But why are they complaining when everyone's getting good marks? It seems quite a silly thing to do. Why would you want to make it so that less people can understand and get a good mark? (Biology student)
I thought [my exams] were hard enough and we all struggled. No one just sailed through. So I think they are challenging and they're not getting easier. I think people are just getting better at teaching them and helping people to get the right answers. (Psychology student)
I think the scope is there for [students] to stretch and challenge themselves but they're not going to get rewarded for it necessarily. (Design and technology teacher)
I think in my subject area I'd go a step further than that and I think that A-level actively discriminates against the most able students. I don't think the most able students are necessarily the people who get the highest grades. I think there's too much recall [in the old papers] … they need to be able to regurgitate definitions, which are fairly insignificant really. (Chemistry teacher)
Changing exam preparation strategies
We had so much exam practice that no question could really surprise us. (Psychology student)
The teacher had told us to predict [the questions] 'cause he said there's always a routine and you can always guess, but he didn't guess it right that time. I still got [a grade] A but I nearly started crying when I saw the questions. (Psychology student)
I was quite gutted when the time was up because I had so much more to write. (Psychology student)
You worry about all the different types of questions they could ask you but sometimes it's an anti-climax. You've prepared reams and reams of information and they ask the question in its simplest form. (Psychology student)
That's the worst, when you just have to say a phrase [to get the mark], so then you just end up learning the phrase and putting the phrase in your answer rather than learning the answer itself. (Biology student)
Given the high-stakes nature of A-levels, it is appropriate that students receive coaching in how to succeed. Interestingly, though, some students suggested that their exam preparation strategies could divert their attention away from learning the subject content.
… try a challenging lunchtime workshop and they are not going to eat until 4 o'clock at this rate you know. And I've noticed that I'm clashing and they say ‘Oh, I can't do that because I've got a Law workshop' or ‘I've got a science workshop'. (Science teacher)
It's about, ultimately, tuning it up, delivering stretch and challenge in the classroom, not in the examination. I think it's too late then. (Physics teacher)
I think they should all be challenging because it's an A-level and you need to be challenged, and it's hard and you need to prepare yourself for university. But in the same sense, if you've done your revision, it should be challenging but answerable, so the challenge should all really be before the exam. (Biology student)
You don't really want to be challenged [in an exam]. You just want to get down what you know. I think you should be challenged in class. In an exam your brain can be a bit everywhere. (Psychology student)
Discussion and conclusions
It was clear from the findings that students and teachers were in general agreement about the new specimen papers. Both groups observed that the new papers contained more applied questions – which were considered more challenging – and that they would require students to think more deeply during the examination. Many students and teachers noted that it would be more difficult to prepare for applied questions, yet it was accepted that these questions were a better test of knowledge which could encourage creativity and synoptic writing. However, many students and teachers identified that the new specimen papers also contained a greater number of short-answer questions, which mostly required factual recall. Although this may have been an attempt to increase breadth within question papers, it appears to be inconsistent with the stretch and challenge policy. This contradiction may be an inevitable consequence of attempting to assess breadth and depth within the same question papers.
It should be noted that only specimen papers were available for this research, and these papers may well have differed from the final question papers created for the subsequent live examinations. This was unavoidable in this instance and understandable in any case, given the difficult timing constraints often associated with policy initiatives, policy implementation, and the typically lengthy question paper design process (e.g. Baird and Lee-Kelley 2009). Consequently, no concrete conclusions can be drawn here regarding question papers and the effectiveness of the stretch and challenge policy. In addition, given that teachers and students had little experience of the new A-level question papers, it is difficult to make any strong inferences about backwash effects. However, the findings suggest at least the potential for positive backwash, with teachers and students recognising the possible benefits of stretch and challenge for teaching and learning.
Both teachers and students touched on the anxiety associated with examinations, with some fearing that increasing challenge at A-level would increase pressure unnecessarily, undermine confidence and have a demoralising effect. Many students expressed anger at the suggestion that examinations should be made more challenging. It cannot be determined from these results whether that anger is a result of perceived unfairness in assessment, a reaction to the popular debate regarding the ‘dumbing down' of education (see for example Paton 2007), or a combination of both. Regardless, it is understandable that students taking these exams might take umbrage at any implication that devalues their efforts.
Examination preparation was a dominant theme in the student and teacher narratives. Students recounted how they were trained by their teachers to perform in examinations, with teachers conscious that this training was an important aspect of their role. Both groups were aware of this highly strategic approach, relying heavily on past papers and mark schemes to maximise exam performance. Although it is appropriate that students received coaching in how to succeed, particularly given the high-stakes nature of A-levels, many teachers and students recognised the limitations of this approach, with its focus on testing rather than on learning.
Torrance (2007) argues that the challenge of learning at A-level, and in post-compulsory education more generally, has been removed by the high levels of transparency in assessment procedures, processes and learning objectives. This transparency facilitates and encourages extensive coaching and practice, as teachers and learners use the available resources to familiarise themselves with the ‘tested' content. Torrance characterises this as a move from ‘assessment of learning', through ‘assessment for learning', to ‘assessment as learning' – where assessment procedures and practices come to dominate and narrow the learning experience. This certainly appeared true of the students and teachers participating in this study who, almost without exception, characterised learning as a process which led to and enabled assessment.
… a blinkered conceptualisation of curriculum, the strong trend towards fine-grained prescription, atomised assessment, the accumulation of little ‘credits' like grains of sand, and intensive coaching towards short-term objectives, are a long call from the production of truly integrated knowledge and skill.
Although teachers and students alike commented that the place to be stretched and challenged was the classroom, rather than the examination, their highly strategic approach to teaching and learning raised a quandary. Would A-level classroom practices change when the assessments changed? Teachers and students perceived that the new question papers had broader content coverage and were clearly aware that this had implications for learning and revision. Some students felt that they would need to know and revise more of the specification content and many identified new questions which would require the application of knowledge. Students and teachers alike felt that these questions would be difficult to prepare for, which could increase stress and uncertainty. Despite this trepidation, there was some support for the applied questions, with some students and teachers suggesting that they provided a better test of knowledge and encouraged greater creativity.
It was clear, even at this early stage of the policy implementation, that teachers were reviewing their teaching strategies to coincide with the introduction of the new question papers. Moreover, teachers were actively seeking support with exam preparation. This has clear links to the stretch and challenge policy and is suggestive of the positive backwash that was intended. Given that they were accustomed to a plethora of past papers for the old exams, teachers particularly wanted to know what kind of responses would be rewarded by examiners of the new papers. This, unfortunately, hints at negative backwash effects: while teachers welcomed the possible benefits for student learning, they were reluctant to give up existing and, arguably, very successful preparation strategies.
And they are learning, whether we like it or not, that education's about taking exams, when in fact it's not.
Acknowledgements
This research was funded by the Qualifications and Curriculum Authority as part of their evaluation of the introduction of the stretch and challenge policy for A-level examinations. The authors would like to thank the students and teachers who participated and the staff who facilitated the visits. We would also like to thank Professor Jannette Elwood and members of the Assessment and Qualifications Alliance Research Committee for critically reviewing a draft of this work.
A preliminary version of these results was presented at the 10th Annual Conference of the Association for Educational Assessment-Europe, in Balzan, Malta, in November 2009.
The views expressed in this report are those of the authors and are not necessarily endorsed by the Assessment and Qualifications Alliance, the University of Bristol or the Qualifications and Curriculum Authority.
Biographies
Anthony L. Daly (BPsych, PhD) is a Research Fellow in the School of Education at the University of South Australia. His current research interests include test anxiety and student motivation.
Jo-Anne Baird (BA, PhD, MBA) is Pearson Professor of Educational Assessment and Director of the Oxford University Centre for Educational Assessment. Her recent research includes assessment policy, examination standards and reliability.
Suzanne Chamberlain (BSc, PhD) is a Senior Research Associate in the Centre for Education Research and Policy at AQA. Suzanne's current research interests concern learners' experiences of education and assessment.
Michelle Meadows (BSc, PhD) is Director of AQA's Centre for Education Research and Policy. She is responsible for research supporting the development of education assessment policy.