Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 6 |
Since 2006 (last 20 years) | 8 |
Descriptor
Comparative Analysis | 8 |
Evaluators | 8 |
Writing Tests | 8 |
Essays | 6 |
English (Second Language) | 5 |
Language Tests | 5 |
Scoring | 5 |
Second Language Learning | 5 |
Correlation | 4 |
Writing Evaluation | 4 |
Computer Assisted Testing | 3 |
Source
Applied Measurement in… | 2 |
ETS Research Report Series | 2 |
JALT CALL Journal | 1 |
Language Assessment Quarterly | 1 |
Language Teaching Research… | 1 |
Language Testing | 1 |
Author
Attali, Yigal | 3 |
Breyer, F. Jay | 1 |
Buzick, Heather | 1 |
Ferrara, Steve | 1 |
Flor, Michael | 1 |
Heidari, Jamshid | 1 |
Khodabandeh, Farzaneh | 1 |
Li, Jiuliang | 1 |
Lorenz, Florian | 1 |
Oliveri, Maria Elena | 1 |
Osama Koraishi | 1 |
Publication Type
Journal Articles | 8 |
Reports - Research | 8 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 3 |
Postsecondary Education | 3 |
Assessments and Surveys
Test of English as a Foreign… | 2 |
Graduate Record Examinations | 1 |
International English… | 1 |
Osama Koraishi – Language Teaching Research Quarterly, 2024
This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading Task 2 writing of the IELTS exam. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis encompassed a multifaceted approach, including a comparison of means and reliability…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Artificial Intelligence
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
Li, Jiuliang – Language Assessment Quarterly, 2018
In language testing programs, different test forms are often used to administer the same test. Demonstrating the comparability of these forms is essential to avoid criticisms of potential test unfairness. However, studies with this objective are scarce. This study aims to investigate the extent to which the picture-prompt writing tasks of three…
Descriptors: Writing Tests, Language Tests, Check Lists, Culture Fair Tests
Attali, Yigal – Language Testing, 2016
A short training program for evaluating responses to an essay writing task consisted of scoring 20 training essays with immediate feedback about the correct score. The same scoring session also served as a certification test for trainees. Participants with little or no previous rating experience completed this session and 14 trainees who passed an…
Descriptors: Writing Evaluation, Writing Tests, Standardized Tests, Evaluators
Buzick, Heather; Oliveri, Maria Elena; Attali, Yigal; Flor, Michael – Applied Measurement in Education, 2016
Automated essay scoring is a developing technology that can provide efficient scoring of large numbers of written responses. Its use in higher education admissions testing provides an opportunity to collect validity and fairness evidence to support current uses and inform its emergence in other areas such as K-12 large-scale assessment. In this…
Descriptors: Essays, Learning Disabilities, Attention Deficit Hyperactivity Disorder, Scoring
Heidari, Jamshid; Khodabandeh, Farzaneh; Soleimani, Hassan – JALT CALL Journal, 2018
The emergence of computer technology in English language teaching has paved the way for teachers' application of Mobile Assisted Language Learning (MALL) and its advantages in teaching. This study aimed to compare the effectiveness of face-to-face instruction with Telegram mobile instruction. Based on a TOEFL test, 60 English foreign language…
Descriptors: Comparative Analysis, Conventional Instruction, Teaching Methods, Computer Assisted Instruction
Attali, Yigal; Sinharay, Sandip – ETS Research Report Series, 2015
The "e-rater"® automated essay scoring system is used operationally in the scoring of "TOEFL iBT"® independent and integrated tasks. In this study we explored the psychometric added value of reporting four trait scores for each of these two tasks, beyond the total e-rater score. The four trait scores are word choice, grammatical…
Descriptors: Writing Tests, Scores, Language Tests, English (Second Language)
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests