ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	2
Since 2016 (last 10 years)	6
Since 2006 (last 20 years)	10

Descriptor

Evaluators	10
Foreign Countries	10
Writing Tests	10
Language Tests	5
Second Language Learning	5
Writing Evaluation	5
English (Second Language)	4
Interrater Reliability	4
Scores	4
Scoring	4
Comparative Analysis	3
Language Proficiency	3
Secondary School Students	3
Computer Assisted Testing	2
Elementary School Students	2
Essays	2
Formative Evaluation	2
Generalization	2
High Stakes Tests	2
Item Analysis	2
Prompting	2
Reliability	2
Test Construction	2
Test Items	2
Validity	2

Source

Language Testing	3
Assessment in Education:…	1
ETS Research Report Series	1
Educational Research	1
International Journal of…	1
JALT CALL Journal	1
Language Assessment Quarterly	1
Scandinavian Journal of…	1

Publication Type

Journal Articles	10
Reports - Research	10
Tests/Questionnaires	1

Education Level

Higher Education	3
Postsecondary Education	3
Secondary Education	3
Elementary Education	2

Location

Norway	2
China	1
Colombia	1
Europe	1
Germany	1
Iran	1
Netherlands	1
United Kingdom (England)	1

Assessments and Surveys

Test of English as a Foreign…

Showing all 10 results

The Longitudinal Stability of Rating Characteristics in an EFL Examination: Methodological and Substantive Considerations

Peer reviewed

Lamprianou, Iasonas; Tsagari, Dina; Kyriakou, Nansia – Language Testing, 2021

This longitudinal study (2002-2014) investigates the stability of rating characteristics of a large group of raters over time in the context of the writing paper of a national high-stakes examination. The study uses one measure of rater severity and two measures of rater consistency. The results suggest that the rating characteristics of…

Descriptors: Longitudinal Studies, Evaluators, High Stakes Tests, Writing Evaluation

Examining Severity and Centrality Effects in TestDaF Writing and Speaking Assessments: An Extended Bayesian Many-Facet Rasch Analysis

Peer reviewed

Eckes, Thomas; Jin, Kuan-Yu – International Journal of Testing, 2021

Severity and centrality are two main kinds of rater effects posing threats to the validity and fairness of performance assessments. Adopting Jin and Wang's (2018) extended facets modeling approach, we separately estimated the magnitude of rater severity and centrality effects in the web-based TestDaF (Test of German as a Foreign Language) writing…

Descriptors: Language Tests, German, Second Languages, Writing Tests

"Digging for Gold" or "Sticking to the Criteria": Teachers' Rationales When Serving as Professional Raters

Peer reviewed

Jølle, Lennart; Skar, Gustaf B. – Scandinavian Journal of Educational Research, 2020

This paper reports findings from a project called "The National Panel of Raters" (NPR) that took place within a writing test programme in Norway (2010-2016). A recent research project found individual differences between the raters in the NPR. This paper reports results from an exploratory follow-up study where 63 NPR members were…

Descriptors: Foreign Countries, Validity, Scoring, Program Descriptions

Measuring the Impact of Rater Negotiation in Writing Performance Assessment

Peer reviewed

Trace, Jonathan; Janssen, Gerriet; Meier, Valerie – Language Testing, 2017

Previous research in second language writing has shown that when scoring performance assessments even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require…

Descriptors: Performance Based Assessment, Second Language Learning, Scoring, Evaluators

Establishing Comparability across Writing Tasks with Picture Prompts of Three Alternate Tests

Peer reviewed

Li, Jiuliang – Language Assessment Quarterly, 2018

In language testing programs, different test forms are often used to administer the same test. Demonstrating the comparability of these forms is essential to avoid criticisms of potential test unfairness. However, studies with this objective are scarce. This study aims to investigate the extent to which the picture-prompt writing tasks of three…

Descriptors: Writing Tests, Language Tests, Check Lists, Culture Fair Tests

Rater Strategies for Reaching Agreement on Pupil Text Quality

Peer reviewed

Jølle, Lennart – Assessment in Education: Principles, Policy & Practice, 2015

Novice members of a Norwegian national rater panel tasked with assessing Year 8 pupils' written texts were studied during three successive preparation sessions (2011-2012). The purpose was to investigate how the raters successfully make use of different decision-making strategies in an assessment situation where pre-set criteria and standards give…

Descriptors: Interrater Reliability, Writing Evaluation, Decision Making, Novices

Effect of Genre on the Generalizability of Writing Scores

Peer reviewed

Bouwer, Renske; Béguin, Anton; Sanders, Ted; van den Bergh, Huub – Language Testing, 2015

In the present study, aspects of the measurement of writing are disentangled in order to investigate the validity of inferences made on the basis of writing performance and to describe implications for the assessment of writing. To include genre as a facet in the measurement, we obtained writing scores of 12 texts in four different genres for each…

Descriptors: Writing Tests, Generalization, Scores, Writing Instruction

A Comparative Analysis of Face to Face Instruction vs. Telegram Mobile Instruction in Terms of Narrative Writing

Peer reviewed
PDF on ERIC

Heidari, Jamshid; Khodabandeh, Farzaneh; Soleimani, Hassan – JALT CALL Journal, 2018

The emergence of computer technology in English language teaching has paved the way for teachers' application of Mobile Assisted Language Learning (MALL) and its advantages in teaching. This study aimed to compare the effectiveness of face-to-face instruction with Telegram mobile instruction. Based on a TOEFL test, 60 English foreign language…

Descriptors: Comparative Analysis, Conventional Instruction, Teaching Methods, Computer Assisted Instruction

An Investigation of the Reliability of Marking of the Key Stage 2 National Curriculum English Writing Tests in England

Peer reviewed

He, Qingping; Anwyll, Steve; Glanville, Matthew; Deavall, Angela – Educational Research, 2013

Background: Although there has been considerable research into the reliability of marking for the Key Stage 3 (KS3) National Curriculum tests (NCTs) and public examinations such as the General Certificate of Secondary Education examinations (GCSEs) in England, little is understood about the level of reliability of marking of the Key Stage 2 (KS2)…

Descriptors: National Curriculum, Foreign Countries, Writing Skills, Writing Tests

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed
PDF on ERIC

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

Author

Jølle, Lennart	2
Anwyll, Steve	1
Bouwer, Renske	1
Breyer, F. Jay	1
Béguin, Anton	1
Deavall, Angela	1
Eckes, Thomas	1
Glanville, Matthew	1
He, Qingping	1
Heidari, Jamshid	1
Janssen, Gerriet	1
Jin, Kuan-Yu	1
Khodabandeh, Farzaneh	1
Kyriakou, Nansia	1
Lamprianou, Iasonas	1
Li, Jiuliang	1
Lorenz, Florian	1
Meier, Valerie	1
Sanders, Ted	1
Skar, Gustaf B.	1
Soleimani, Hassan	1
Trace, Jonathan	1
Tsagari, Dina	1
Zhang, Mo	1
van den Bergh, Huub	1