Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 5 |
Since 2006 (last 20 years) | 12 |
Descriptor
Evaluators | 13 |
Scores | 13 |
Writing Tests | 13 |
Second Language Learning | 8 |
Writing Evaluation | 8 |
Language Tests | 7 |
English (Second Language) | 6 |
Scoring | 6 |
Essays | 5 |
Computer Assisted Testing | 4 |
Correlation | 4 |
Source
Language Testing | 4 |
ETS Research Report Series | 3 |
Higher Education Research and Development | 1 |
Journal of Effective Teaching | 1 |
Language Testing in Asia | 1 |
Scandinavian Journal of Educational Research | 1 |
Studies in Applied Linguistics & TESOL | 1 |
Author
Attali, Yigal | 2 |
Ann Tai Choe | 1 |
Blanchard, Daniel | 1 |
Bouwer, Renske | 1 |
Breyer, F. Jay | 1 |
Béguin, Anton | 1 |
Cahill, Aoife | 1 |
Chodorow, Martin | 1 |
Daniel Holden | 1 |
Daniel R. Isbell | 1 |
Eskin, Daniel | 1 |
Publication Type
Journal Articles | 12 |
Reports - Research | 12 |
Tests/Questionnaires | 2 |
Reports - Evaluative | 1 |
Speeches/Meeting Papers | 1 |
Location
Colombia | 1 |
Hawaii | 1 |
Netherlands | 1 |
New York (New York) | 1 |
Norway | 1 |
Assessments and Surveys
Test of English as a Foreign Language | 2 |
International English Language Testing System | 1 |
National Assessment of… | 1 |
Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024
In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…
Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)
Eskin, Daniel – Studies in Applied Linguistics & TESOL, 2022
For agencies that deliver high-stakes Second Language (L2) proficiency exams, a research agenda has been undertaken for years to examine the role of rater, task, and rubric as sources of variability in their performance assessments (Lee, 2006; Sawaki & Sinharay, 2013; Xi, 2007; Xi & Mollaun, 2006). However, these challenges are more…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Student Placement
Jølle, Lennart; Skar, Gustaf B. – Scandinavian Journal of Educational Research, 2020
This paper reports findings from a project called "The National Panel of Raters" (NPR) that took place within a writing test programme in Norway (2010-2016). A recent research project found individual differences between the raters in the NPR. This paper reports results from an explorative follow-up study in which 63 NPR members were…
Descriptors: Foreign Countries, Validity, Scoring, Program Descriptions
Trace, Jonathan; Janssen, Gerriet; Meier, Valerie – Language Testing, 2017
Previous research in second language writing has shown that, when scoring performance assessments, even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require…
Descriptors: Performance Based Assessment, Second Language Learning, Scoring, Evaluators
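A purely illustrative sketch of one mechanical score-resolution rule is given below in Python; the averaging rule, tolerance, and scale are assumptions for illustration, not the discussion-based procedure examined by Trace, Janssen, and Meier.

```python
# Illustrative sketch only: the averaging rule, tolerance, and scale are
# assumptions, not the discussion-based resolution used in the cited study.

def resolve_score(rater_a, rater_b, max_gap=1):
    """Average two ratings when they fall within `max_gap` points;
    otherwise return None to flag the essay for resolution."""
    if abs(rater_a - rater_b) <= max_gap:
        return (rater_a + rater_b) / 2
    return None  # discrepant ratings: send to discussion or a third rater

print(resolve_score(15, 16))  # 15.5 (within tolerance, scores are averaged)
print(resolve_score(12, 17))  # None (flagged for discussion)
```

Discussion-based resolution, as studied in the article, replaces the mechanical branch with a consensus score negotiated between the two raters.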
Attali, Yigal – Language Testing, 2016
A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session and 14 trainees who passed an…
Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators
Bouwer, Renske; Béguin, Anton; Sanders, Ted; van den Bergh, Huub – Language Testing, 2015
In the present study, aspects of the measurement of writing are disentangled in order to investigate the validity of inferences made on the basis of writing performance and to describe implications for the assessment of writing. To include genre as a facet in the measurement, we obtained writing scores of 12 texts in four different genres for each…
Descriptors: Writing Tests, Generalization, Scores, Writing Instruction
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score. The four trait scores are word choice, grammatical…
Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)
Müller, Amanda – Higher Education Research and Development, 2015
This paper attempts to demonstrate the differences in writing between International English Language Testing System (IELTS) bands 6.0, 6.5 and 7.0. An analysis of exemplars provided by the IELTS test makers reveals that IELTS 6.0, 6.5 and 7.0 writers make a minimum of 206, 96, and 35 errors per 1,000 words, respectively. The following section…
Descriptors: English (Second Language), Second Language Learning, Language Tests, Scores
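The per-1,000-word normalization that underlies error-rate comparisons like Müller's can be reproduced with a short calculation; the sketch below uses invented counts, not data from the IELTS exemplars.

```python
# Sketch of the errors-per-1,000-words normalization.
# The counts below are invented for illustration, not Müller's (2015) data.

def errors_per_thousand(error_count, word_count):
    return error_count / word_count * 1000

print(errors_per_thousand(52, 250))  # 208.0 errors per 1,000 words
print(errors_per_thousand(9, 260))   # ~34.6 errors per 1,000 words
```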
Ruegg, Rachael; Sugiyama, Yuko – Language Testing in Asia, 2013
Whether foreign language writing is rated using analytic rating scales or holistically, the organization of ideas is invariably one of the aspects assessed. However, it is unclear what raters are sensitive to when rating writing for organization. Which do they value more highly, the physical aspects of organization, such as paragraphing and the…
Descriptors: Second Language Learning, Writing Evaluation, Evaluation Criteria, Writing Tests
Blanchard, Daniel; Tetreault, Joel; Higgins, Derrick; Cahill, Aoife; Chodorow, Martin – ETS Research Report Series, 2013
This report presents work on the development of a new corpus of non-native English writing. It will be useful for the task of native language identification, as well as grammatical error detection and correction, and automatic essay scoring. In this report, the corpus is described in detail.
Descriptors: Language Tests, Second Language Learning, English (Second Language), Writing Tests
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
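Agreement between automated and human scores in studies of this kind is often summarized with quadratically weighted kappa. The sketch below is a generic implementation on toy scores with an assumed 1-6 scale; it is not code or data from the ETS report.

```python
# Quadratically weighted kappa between two score vectors (toy data only).
from collections import Counter

def quadratic_weighted_kappa(a, b, min_s=1, max_s=6):
    n = len(a)
    cats = range(min_s, max_s + 1)
    observed = Counter(zip(a, b))    # joint counts of (human, machine) scores
    pa, pb = Counter(a), Counter(b)  # marginal counts per rater
    num = den = 0.0
    for i in cats:
        for j in cats:
            w = (i - j) ** 2 / (max_s - min_s) ** 2  # quadratic disagreement weight
            num += w * observed[(i, j)] / n
            den += w * (pa[i] / n) * (pb[j] / n)
    return 1 - num / den

human   = [3, 4, 4, 5, 2, 3, 4, 5]
e_rater = [3, 4, 5, 5, 2, 3, 3, 5]
print(round(quadratic_weighted_kappa(human, e_rater), 3))
```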
Glew, David; Meyer, Tracy; Sawyer, Becky; Schuhmann, Pete; Wray, Barry – Journal of Effective Teaching, 2011
Business schools are often criticized for the inadequate writing skills of their graduates. Improving writing skills involves first understanding the current skill level of students. This research attempts to provide insights into the effectiveness of the current method of assessing writing skills in a school of business at a large regional…
Descriptors: Undergraduate Students, Business Administration Education, Business Schools, Writing Skills
Kaplan, Bruce A.; Johnson, Eugene G. – 1992
Across the field of educational assessment, the case has been made for alternatives to the multiple-choice item type. Most of the alternative item types require a subjective evaluation by a rater, and the reliability of this subjective rating is a key concern for such items. In this paper, measures of reliability are…
Descriptors: Educational Assessment, Elementary Secondary Education, Estimation (Mathematics), Evaluators
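Two common summaries of the reliability of subjective ratings are exact/adjacent agreement and Cohen's kappa. The sketch below uses invented ratings on an assumed scale and is offered only as a generic illustration, not as the specific measures examined in the paper.

```python
# Generic rater-agreement summaries on toy data (not the paper's measures).
from collections import Counter

def agreement_rates(r1, r2):
    """Proportion of exact matches and of ratings within one point."""
    n = len(r1)
    exact = sum(x == y for x, y in zip(r1, r2)) / n
    adjacent = sum(abs(x - y) <= 1 for x, y in zip(r1, r2)) / n
    return exact, adjacent

def cohens_kappa(r1, r2):
    """Chance-corrected exact agreement between two raters."""
    n = len(r1)
    po = sum(x == y for x, y in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    pe = sum((c1[k] / n) * (c2[k] / n) for k in set(c1) | set(c2))
    return (po - pe) / (1 - pe)

rater_1 = [2, 3, 3, 4, 1, 2, 4, 3]
rater_2 = [2, 3, 4, 4, 1, 2, 3, 3]
print(agreement_rates(rater_1, rater_2))
print(round(cohens_kappa(rater_1, rater_2), 3))
```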