ERIC - Search Results

Publication Date

In 2025	1
Since 2024	4
Since 2021 (last 5 years)	7
Since 2016 (last 10 years)	18
Since 2006 (last 20 years)	29

Descriptor

Comparative Analysis	34
Essays	34
Writing Evaluation	20
Interrater Reliability	19
English (Second Language)	18
Foreign Countries	18
Second Language Learning	16
Reliability	15
Scoring	15
Evaluators	14
Computer Assisted Testing	10
Computer Software	10
Correlation	10
Second Language Instruction	10
Scores	8
Validity	8
Writing Tests	8
Statistical Analysis	7
Grading	6
Student Evaluation	6
Feedback (Response)	5
High School Students	5
Language Tests	5
Models	5
Scoring Rubrics	5

Publication Type

Journal Articles	26
Reports - Research	26
Reports - Evaluative	3
Reports - Descriptive	2
Tests/Questionnaires	2
Books	1
Dissertations/Theses -…	1
Guides - Non-Classroom	1
Information Analyses	1
Numerical/Quantitative Data	1
Speeches/Meeting Papers	1

Education Level

Higher Education	12
Postsecondary Education	11
Secondary Education	5
High Schools	4
Elementary Secondary Education	3
Grade 11	2
Grade 12	1

Audience

Practitioners	2
Teachers	2

Location

China	5
United Kingdom (England)	3
Australia	2
Connecticut	2
Germany	2
Iran	2
New Hampshire	2
New York	2
Rhode Island	2
Vermont	2
California	1
Hong Kong	1
Nigeria	1
Philippines	1
Singapore	1
Sweden	1
Taiwan	1
Turkey	1
West Virginia	1

Laws, Policies, & Programs

Every Student Succeeds Act…

Assessments and Surveys

Graduate Record Examinations	2
National Assessment of…	2
New York State Regents…	2
Test of English as a Foreign…	2
SAT (College Admission Test)	1

Showing 1 to 15 of 34 results

Coherence-Based Automatic Short Answer Scoring Using Sentence Embedding

Peer reviewed

Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024

Automatic essay scoring (AES) is an essential educational application in natural language processing. This automated process will alleviate the burden by increasing the reliability and consistency of the assessment. With the advances in text embedding libraries and neural network models, AES systems achieved good results in terms of accuracy.…

Descriptors: Scoring, Essays, Writing Evaluation, Memory

Utilizing Large Language Models for EFL Essay Grading: An Examination of Reliability and Validity in Rubric-Based Assessments

Peer reviewed

Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025

This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics

More Efficient Processes for Creating Automated Essay Scoring Frameworks: A Demonstration of Two Algorithms

Peer reviewed

Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021

Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…

Descriptors: Scoring, Essays, Writing Evaluation, Computer Software

Judges' Views on Pairwise Comparative Judgement and Rank Ordering as Alternatives to Analytical Essay Marking

Walland, Emma – Research Matters, 2022

In this article, I report on examiners' views and experiences of using Pairwise Comparative Judgement (PCJ) and Rank Ordering (RO) as alternatives to traditional analytical marking for GCSE English Language essays. Fifteen GCSE English Language examiners took part in the study. After each had judged 100 pairs of essays using PCJ and eight packs of…

Descriptors: Essays, Grading, Writing Evaluation, Evaluators

Meta-Analysis of Inter-Rater Agreement and Discrepancy Between Human and Automated English Essay Scoring

Peer reviewed

Jiyeo Yun – English Teaching, 2023

Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…

Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring

Depth-Perception-Based Representation in Holistic Rating on ESL Essay Writing

Peer reviewed

Lian Li; Jiehui Hu; Yu Dai; Ping Zhou; Wanhong Zhang – Reading & Writing Quarterly, 2024

This paper proposes to use depth perception to represent raters' decision in holistic evaluation of ESL essays, as an alternative medium to conventional form of numerical scores. The researchers verified the new method's accuracy and inter/intra-rater reliability by inviting 24 ESL teachers to perform different representations when rating 60…

Descriptors: Essays, Holistic Approach, Writing Evaluation, Accuracy

Impacts of ChatGPT-Assisted Writing for EFL English Majors: Feasibility and Challenges

Peer reviewed

Chung-You Tsai; Yi-Ti Lin; Iain Kelsall Brown – Education and Information Technologies, 2024

This study sought to determine the impacts of using ChatGPT to assist English as a foreign language (EFL) English college majors in revising essays and the possibility of leading to higher scores and potentially causing unfairness. A prospective, double-blinded, paired-comparison study was conducted in Feb. 2023. A total of 44 students provided 44 original essays…

Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, English (Second Language)

Writing Scale Effects on Raters: An Exploratory Study

Peer reviewed

Jeong, Heejeong – Language Testing in Asia, 2019

In writing assessment, finding a valid, reliable, and efficient scale is critical. Appropriate scales increase rater reliability and can also save time and money. This exploratory study compared the effects of a binary scale and an analytic scale across teacher raters and expert raters. The purpose of the study is to find out how different scale…

Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction

Effect of Quality Characteristics of Peer Raters on Rating Errors in Peer Assessment

Peer reviewed

Guo, Xiuyan; Lei, Pui-Wa – International Journal of Testing, 2020

Little research has been done on the effects of peer raters' quality characteristics on peer rating qualities. This study aims to address this gap and investigate the effects of key variables related to peer raters' qualities, including content knowledge, previous rating experience, training on rating tasks, and rating motivation. In an experiment…

Descriptors: Peer Evaluation, Error Patterns, Correlation, Knowledge Level

The Impact of Rater Variability on Relationships among Different Effect-Size Indices for Inter-Rater Agreement between Human and Automated Essay Scoring

Yun, Jiyeo – ProQuest LLC, 2017

Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…

Descriptors: Interrater Reliability, Essays, Scoring, Evaluators

Differences in Less Proficient and More Proficient ESL College Writing in the Philippine Setting

Gustilo, Leah E. – Online Submission, 2016

The present study aimed at characterizing what skilled or more proficient ESL college writing is in the Philippine setting through a contrastive analysis of three groups of variables identified from previous studies: resources, processes, and performance of ESL writers. Based on Chenoweth and Hayes' (2001; 2003) framework, the resource level…

Descriptors: Language Proficiency, English (Second Language), Second Language Learning, Foreign Countries

Evaluating Comparative Judgment as an Approach to Essay Scoring

Peer reviewed

Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016

As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…

Descriptors: Essays, Scoring, Comparative Analysis, Evaluators

Does Indirect Writing Assessment Have Any Relevance to Direct Writing Assessment? Focus on Validity and Reliability

Peer reviewed

Kural, Faruk – Journal of Language and Linguistic Studies, 2018

The present paper, which is a study based on midterm exam results of 53 University English prep-school students, examines correlation between a direct writing test, measured holistically by multiple-trait scoring, and two indirect writing tests used in a competence exam, one of which is a multiple-choice cloze test and the other a rewrite test…

Descriptors: Writing Evaluation, Cloze Procedure, Comparative Analysis, Essays

A Comparison of Newly-Trained and Experienced Raters on a Standardized Writing Assessment

Peer reviewed

Attali, Yigal – Language Testing, 2016

A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session and 14 trainees who passed an…

Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators

A Comparative Analysis of Face to Face Instruction vs. Telegram Mobile Instruction in Terms of Narrative Writing

Peer reviewed

Heidari, Jamshid; Khodabandeh, Farzaneh; Soleimani, Hassan – JALT CALL Journal, 2018

The emergence of computer technology in English language teaching has paved the way for teachers' application of Mobile Assisted Language Learning (MALL) and its advantages in teaching. This study aimed to compare the effectiveness of face-to-face instruction with Telegram mobile instruction. Based on a TOEFL test, 60 English foreign language…

Descriptors: Comparative Analysis, Conventional Instruction, Teaching Methods, Computer Assisted Instruction

Source

British Journal of…	2
ETS Research Report Series	2
Language Testing	2
Action in Teacher Education	1
Applied Measurement in…	1
Assessing Writing	1
CALICO Journal	1
College Entrance Examination…	1
Council of Chief State School…	1
Education and Information…	1
Educational Research and…	1
English Language Teaching	1
English Teaching	1
European Journal of Education	1
International Journal of…	1
Iranian Journal of Language…	1
JALT CALL Journal	1
Journal of Computer Assisted…	1
Journal of Educational…	1
Journal of Language and…	1
Language Testing in Asia	1
Learning Policy Institute	1
Online Submission	1
ProQuest LLC	1
Psychology Learning and…	1

Author

Attali, Yigal	2
Darling-Hammond, Linda	2
Lenhard, Wolfgang	2
Seifried, Eva	2
Spinath, Birgit	2
Akinwamide, Timothy Kolade	1
Baier, Herbert	1
Bell, John F.	1
Breyer, F. Jay	1
Camara, Wayne J.	1
Chase, Clinton I.	1
Chung-You Tsai	1
Coniam, David	1
Dadi Ramesh	1
Fatih Yavuz	1
Ferrara, Steve	1
Gamze Yavas Çelik	1
Gierl, Mark J.	1
Guo, Xiuyan	1
Gustilo, Leah E.	1
Harrington, M.	1
Heidari, Jamshid	1
Hixson, Nate	1
Iain Kelsall Brown	1
Jacobs, Lucy Cheser	1