Publication Date
In 2025: 1
Since 2024: 7
Since 2021 (last 5 years): 22
Since 2016 (last 10 years): 58
Since 2006 (last 20 years): 107
Author
Attali, Yigal: 4
Kantor, Robert: 3
Lee, Yong-Won: 3
Ben-Simon, Anat: 2
Coniam, David: 2
Crossley, Scott A.: 2
Deane, Paul: 2
Gentile, Claudia: 2
Johnson, Robert L.: 2
Lenhard, Wolfgang: 2
McNamara, Danielle S.: 2
Publication Type
Journal Articles: 134
Reports - Research: 102
Reports - Evaluative: 18
Tests/Questionnaires: 13
Reports - Descriptive: 8
Opinion Papers: 6
Speeches/Meeting Papers: 2
Guides - General: 1
Information Analyses: 1
Non-Print Media: 1
Education Level
Higher Education: 55
Postsecondary Education: 42
Secondary Education: 13
Elementary Education: 7
Middle Schools: 7
High Schools: 6
Grade 8: 5
Junior High Schools: 4
Elementary Secondary Education: 3
Grade 11: 2
Grade 12: 2
Audience
Practitioners: 3
Teachers: 3
Location
China: 8
Iran: 7
Turkey: 5
Germany: 3
Hong Kong: 3
United Kingdom (England): 3
Australia: 2
Delaware: 2
Indonesia: 2
New Jersey: 2
Taiwan: 2
Abbas, Mohsin; van Rosmalen, Peter; Kalz, Marco – IEEE Transactions on Learning Technologies, 2023
For predicting and improving the quality of essays, text analytic metrics (surface, syntactic, morphological, and semantic features) can be used to provide formative feedback to the students in higher education. In this study, the goal was to identify a sufficient number of features that exhibit a fair proxy of the scores given by the human raters…
Descriptors: Feedback (Response), Automation, Essays, Scoring
Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024
Automatic essay scoring (AES) is an essential educational application of natural language processing. Automating this process alleviates the grading burden while increasing the reliability and consistency of assessment. With advances in text embedding libraries and neural network models, AES systems have achieved good results in terms of accuracy.…
Descriptors: Scoring, Essays, Writing Evaluation, Memory
Peer Overmarking and Insufficient Diagnosticity: The Impact of the Rating Method for Peer Assessment
Van Meenen, Florence; Coertjens, Liesje; Van Nes, Marie-Claire; Verschuren, Franck – Advances in Health Sciences Education, 2022
The present study explores two rating methods for peer assessment (analytical rating using criteria and comparative judgement) in light of concurrent validity, reliability and insufficient diagnosticity (i.e. the degree to which substandard work is recognised by the peer raters). During a second-year undergraduate course, students wrote a one-page…
Descriptors: Evaluation Methods, Peer Evaluation, Accuracy, Evaluation Criteria
Wendler, Cathy; Glazer, Nancy; Cline, Frederick – ETS Research Report Series, 2019
One of the challenges in scoring constructed-response (CR) items and tasks is ensuring that rater drift does not occur during or across scoring windows. Rater drift reflects changes in how raters interpret and use established scoring criteria to assign essay scores. Calibration is a process used to help control rater drift and, as such, serves as…
Descriptors: College Entrance Examinations, Graduate Study, Accuracy, Test Reliability
Beseiso, Majdi; Alzubi, Omar A.; Rashaideh, Hasan – Journal of Computing in Higher Education, 2021
E-learning is gradually gaining prominence in higher education, with universities enlarging provision and more students enrolling. Automated essay scoring (AES) thus holds strong appeal for universities seeking to manage growing learning interest and reduce the costs associated with human raters. The growth in…
Descriptors: Automation, Scoring, Essays, Writing Tests
Elif Sari – International Journal of Assessment Tools in Education, 2024
Employing G-theory and rater interviews, the study investigated how a high-stakes writing assessment procedure (i.e., a single-task, single-rater, and holistic scoring procedure) impacted the variability and reliability of its scores within the Turkish higher education context. Thirty-two essays written on two different writing tasks (i.e.,…
Descriptors: Foreign Countries, High Stakes Tests, Writing Evaluation, Scores
Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025
This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics
Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024
In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from the Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…
Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)
Kumar, Vivekanandan S.; Boulanger, David – International Journal of Artificial Intelligence in Education, 2021
This article investigates the feasibility of using automated scoring methods to evaluate the quality of student-written essays. In 2012, Kaggle hosted an Automated Student Assessment Prize contest to find effective solutions to automated testing and grading. This article: a) analyzes the datasets from the contest -- which contained hand-graded…
Descriptors: Automation, Scoring, Essays, Writing Evaluation
Ramon-Casas, Marta; Nuño, Neus; Pons, Ferran; Cunillera, Toni – Assessment & Evaluation in Higher Education, 2019
This article presents an empirical evaluation of the validity and reliability of a peer-assessment activity to improve academic writing competences. Specifically, we explored a large group of psychology undergraduate students with different initial writing skills. Participants (n = 365) produced two different essays, which were evaluated by their…
Descriptors: Peer Evaluation, Validity, Reliability, Writing Skills
Pruchnic, Jeff; Barton, Ellen; Primeau, Sarah; Trimble, Thomas; Varty, Nicole; Foster, Tanina – Composition Forum, 2021
Over the past two decades, reflective writing has occupied an increasingly prominent position in composition theory, pedagogy, and assessment as researchers have described the value of reflection and reflective writing in college students' development of higher-order writing skills, such as genre conventions (Yancey, "Reflection";…
Descriptors: Reflection, Correlation, Essays, Freshman Composition
Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021
Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Walland, Emma – Research Matters, 2022
In this article, I report on examiners' views and experiences of using Pairwise Comparative Judgement (PCJ) and Rank Ordering (RO) as alternatives to traditional analytical marking for GCSE English Language essays. Fifteen GCSE English Language examiners took part in the study. After each had judged 100 pairs of essays using PCJ and eight packs of…
Descriptors: Essays, Grading, Writing Evaluation, Evaluators
Shabani, Enayat A.; Panahi, Jaleh – Language Testing in Asia, 2020
The literature on using scoring rubrics in writing assessment highlights their significance as practical and useful means of assessing the quality of writing tasks. This study investigates the agreement among the rubrics endorsed and used for assessing essay writing tasks by internationally recognized tests of English language…
Descriptors: Writing Evaluation, Scoring Rubrics, Scores, Interrater Reliability
Jiyeo Yun – English Teaching, 2023
Studies on automatic scoring systems in writing assessment have evaluated the relationship between human and machine scores to establish the reliability of automated essay scoring systems. This study investigated the magnitudes of inter-rater agreement and discrepancy indices, especially for human and machine scoring, in writing assessment.…
Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring
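Several of the entries above (e.g., Shin & Gierl, 2021; Jiyeo Yun, 2023) report human-machine agreement for automated essay scoring. A common index in that literature is quadratic weighted kappa, which penalizes large score disagreements more heavily than small ones. The following is a minimal illustrative sketch; the function name and toy ratings are hypothetical, not drawn from any of the studies listed:

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Agreement between two raters on an ordinal score scale."""
    n = max_rating - min_rating + 1
    # Observed confusion matrix of the two raters' scores.
    observed = [[0.0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1.0
    total = float(len(rater_a))
    # Marginal histograms and the chance-expected matrix.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]
    expected = [[hist_a[i] * hist_b[j] / total for j in range(n)]
                for i in range(n)]
    # Quadratic disagreement weights: 0 on the diagonal, 1 at the corners.
    weight = [[(i - j) ** 2 / (n - 1) ** 2 for j in range(n)]
              for i in range(n)]
    num = sum(weight[i][j] * observed[i][j]
              for i in range(n) for j in range(n))
    den = sum(weight[i][j] * expected[i][j]
              for i in range(n) for j in range(n))
    return 1.0 - num / den

print(quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 4], 1, 4))  # 1.0
```

Perfect agreement yields 1.0, chance-level agreement yields roughly 0, and systematic disagreement can be negative; reported values in the AES studies above are typically interpreted against such benchmarks.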