ERIC - Search Results

Publication Date

In 2025	1
Since 2024	3
Since 2021 (last 5 years)	7
Since 2016 (last 10 years)	10
Since 2006 (last 20 years)	12

Descriptor

Writing Evaluation	14
Evaluators	9
Second Language Learning	9
English (Second Language)	8
Interrater Reliability	8
Essays	6
Rating Scales	6
Writing Tests	6
Foreign Countries	4
Language Tests	4
Scores	4
Scoring	4
Comparative Analysis	3
Computational Linguistics	3
Evaluation Methods	3
Item Response Theory	3
Language Usage	3
Placement Tests	3
Reliability	3
Test Reliability	3
Certification	2
College Students	2
Correlation	2
Decision Making	2
Essay Tests	2
More ▼

Source

Language Testing

Publication Type

Journal Articles	14
Reports - Research	11
Reports - Evaluative	2
Tests/Questionnaires	2
Information Analyses	1
Reports - Descriptive	1

Education Level

Higher Education	4
Secondary Education	3
Postsecondary Education	2
High Schools	1
Junior High Schools	1
Middle Schools	1

Audience

Location

Austria	1
Europe	1
Hawaii	1
Japan	1
Netherlands	1
Ohio	1

Laws, Policies, & Programs

Assessments and Surveys

Test of English as a Foreign…	1
Test of Written English	1

What Works Clearinghouse Rating

Showing all 14 results Save | Export

Triangulating Natural Language Processing (NLP)-Based Analysis of Rater Comments and Many-Facet Rasch Measurement (MFRM): An Innovative Approach to Investigating Raters' Application of Rating Scales in Writing Assessment

Peer reviewed

Direct link

Huiying Cai; Xun Yan – Language Testing, 2024

Rater comments tend to be qualitatively analyzed to indicate raters' application of rating scales. This study applied natural language processing (NLP) techniques to quantify meaningful, behavioral information from a corpus of rater comments and triangulated that information with a many-facet Rasch measurement (MFRM) analysis of rater scores. The…

Descriptors: Natural Language Processing, Item Response Theory, Rating Scales, Writing Evaluation

Comparative Judgement for Evaluating Young Learners' EFL Writing Performances: Reliability and Teacher Perceptions of Holistic and Dimension-Based Judgements

Peer reviewed

Direct link

Rebecca Sickinger; Tineke Brunfaut; John Pill – Language Testing, 2025

Comparative Judgement (CJ) is an evaluation method, typically conducted online, whereby a rank order is constructed, and scores calculated, from judges' pairwise comparisons of performances. CJ has been researched in various educational contexts, though only rarely in English as a Foreign Language (EFL) writing settings, and is generally agreed to…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

Making Each Point Count: Revising a Local Adaptation of the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE Rubric

Peer reviewed

Direct link

Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024

In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…

Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)

More Efficient Processes for Creating Automated Essay Scoring Frameworks: A Demonstration of Two Algorithms

Peer reviewed

Direct link

Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021

Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…

Descriptors: Scoring, Essays, Writing Evaluation, Computer Software

"How Do Raters Learn to Rate?" Many-Facet Rasch Modeling of Rater Performance over the Course of a Rater Certification Program

Peer reviewed

Direct link

Yan, Xun; Chuang, Ping-Lin – Language Testing, 2023

This study employed a mixed-methods approach to examine how rater performance develops during a semester-long rater certification program for an English as a Second Language (ESL) writing placement test at a large US university. From 2016 to 2018, we tracked three groups of novice raters (n = 30) across four rounds in the certification program.…

Descriptors: Evaluators, Interrater Reliability, Item Response Theory, Certification

Automated Scoring of Junior and Senior High Essays Using Coh-Metrix Features: Implications for Large-Scale Language Testing

Peer reviewed

Direct link

Latifi, Syed; Gierl, Mark – Language Testing, 2021

An automated essay scoring (AES) program is a software system that uses techniques from corpus and computational linguistics and machine learning to grade essays. In this study, we aimed to describe and evaluate particular language features of Coh-Metrix for a novel AES program that would score junior and senior high school students' essays from…

Descriptors: Writing Evaluation, Computer Assisted Testing, Scoring, Essays

The Longitudinal Stability of Rating Characteristics in an EFL Examination: Methodological and Substantive Considerations

Peer reviewed

Direct link

Lamprianou, Iasonas; Tsagari, Dina; Kyriakou, Nansia – Language Testing, 2021

This longitudinal study (2002-2014) investigates the stability of rating characteristics of a large group of raters over time in the context of the writing paper of a national high-stakes examination. The study uses one measure of rater severity and two measures of rater consistency. The results suggest that the rating characteristics of…

Descriptors: Longitudinal Studies, Evaluators, High Stakes Tests, Writing Evaluation

Functional Adequacy in L2 Writing: Towards a New Rating Scale

Peer reviewed

Direct link

Kuiken, Folkert; Vedder, Ineke – Language Testing, 2017

The importance of functional adequacy as an essential component of L2 proficiency has been observed by several authors (Pallotti, 2009; De Jong, Steinel, Florijn, Schoonen, & Hulstijn, 2012a, b). The rationale underlying the present study is that the assessment of writing proficiency in L2 is not fully possible without taking into account the…

Descriptors: Second Language Learning, Rating Scales, Computational Linguistics, Persuasive Discourse

A Comparison of Newly-Trained and Experienced Raters on a Standardized Writing Assessment

Peer reviewed

Direct link

Attali, Yigal – Language Testing, 2016

A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session and 14 trainees who passed an…

Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators

Grounding Lexical Diversity in Human Judgments

Peer reviewed

Direct link

Jarvis, Scott – Language Testing, 2017

The present study discusses the relevance of measures of lexical diversity (LD) to the assessment of learner corpora. It also argues that existing measures of LD, many of which have become specialized for use with language corpora, are fundamentally measures of lexical repetition, are based on an etic perspective of language, and lack construct…

Descriptors: Computational Linguistics, English (Second Language), Second Language Learning, Native Speakers

Rater Bias Patterns in an EFL Writing Assessment

Peer reviewed

Direct link

Schaefer, Edward – Language Testing, 2008

The present study employed multi-faceted Rasch measurement (MFRM) to explore the rater bias patterns of native English-speaker (NES) raters when they rate EFL essays. Forty NES raters rated 40 essays written by female Japanese university students on a single topic adapted from the TOEFL Test of Written English (TWE). The essays were assessed using…

Descriptors: Writing Evaluation, Writing Tests, Program Effectiveness, Essays

Evaluating Rater Responses to an Online Training Program for L2 Writing Assessment

Peer reviewed

Direct link

Elder, Catherine; Barkhuizen, Gary; Knoch, Ute; von Randow, Janet – Language Testing, 2007

The use of online rater self-training is growing in popularity and has obvious practical benefits, facilitating access to training materials and rating samples and allowing raters to reorient themselves to the rating scale and self monitor their behaviour at their own convenience. However there has thus far been little research into rater…

Descriptors: Writing Evaluation, Writing Tests, Scoring Rubrics, Rating Scales

The Assessment of Writing Ability: Expert Readers versus Lay Readers.

Peer reviewed

Schoonen, Rob; And Others – Language Testing, 1997

Reports on three studies conducted in the Netherlands about the reading reliability of lay and expert readers in rating content and language usage of students' writing performances in three kinds of writing assignments. Findings reveal that expert readers are more reliable in rating usage, whereas both lay and expert readers are reliable raters of…

Descriptors: Foreign Countries, Interrater Reliability, Language Usage, Models

A Long-Term Research Agenda for the Test of Written English.

Peer reviewed

Stansfield, Charles W.; Ross, Jacqueline – Language Testing, 1988

Outlines research necessary for determining the validity and reliability of Test of Written English, an essay test that directly measures writing ability and complements Test of English-as-a-Foreign-Language's (TOEFL) indirect assessment of writing skills. Research should cover such aspects as construct, criterion-related, concurrent, content, and…

Descriptors: English (Second Language), Essay Tests, Language Research, Language Tests

Ann Tai Choe	1
Attali, Yigal	1
Barkhuizen, Gary	1
Chuang, Ping-Lin	1
Daniel Holden	1
Daniel R. Isbell	1
Elder, Catherine	1
Gierl, Mark	1
Gierl, Mark J.	1
Huiying Cai	1
Jarvis, Scott	1
John Pill	1
Knoch, Ute	1
Kuiken, Folkert	1
Kyriakou, Nansia	1
Lamprianou, Iasonas	1
Latifi, Syed	1
Rebecca Sickinger	1
Ross, Jacqueline	1
Schaefer, Edward	1
Schoonen, Rob	1
Shin, Jinnie	1
Stansfield, Charles W.	1
Tineke Brunfaut	1
More ▼