ERIC Number: ED615602
Record Type: Non-Journal
Publication Date: 2021
Pages: 6
Abstractor: As Provided
ISBN: N/A
ISSN: N/A
EISSN: N/A
On the Limitations of Human-Computer Agreement in Automated Essay Scoring
Doewes, Afrizal; Pechenizkiy, Mykola
International Educational Data Mining Society, Paper presented at the International Conference on Educational Data Mining (EDM) (14th, Online, Jun 29-Jul 2, 2021)
Scoring essays is generally an exhausting and time-consuming task for teachers. Automated Essay Scoring (AES) makes the scoring process faster and more consistent. The most logical way to assess the performance of an automated scorer is to measure its score agreement with human raters. However, we provide empirical evidence that an essay scorer that performs well from the quantitative evaluation point of view is still too risky to deploy. We propose several input scenarios for evaluating the reliability and validity of the system, such as off-topic essays, gibberish, and paraphrased answers. We demonstrate that automated scoring models with high human-computer agreement fail to perform well on two out of three test scenarios. We also discuss strategies to improve the performance of the system. [For the full proceedings, see ED615472.]
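The record does not specify which agreement statistic the paper uses; in AES research, human-computer agreement is most commonly reported as quadratic weighted kappa (QWK), so the following is a minimal Python sketch under that assumption, for illustration only rather than the authors' own evaluation code:

```python
import numpy as np

def quadratic_weighted_kappa(human, machine, min_rating=None, max_rating=None):
    """Quadratic weighted kappa between human and machine integer scores.

    Equivalent to sklearn.metrics.cohen_kappa_score(human, machine,
    weights="quadratic"). Scores must span at least two rating levels.
    """
    human = np.asarray(human, dtype=int)
    machine = np.asarray(machine, dtype=int)
    if min_rating is None:
        min_rating = int(min(human.min(), machine.min()))
    if max_rating is None:
        max_rating = int(max(human.max(), machine.max()))
    n = max_rating - min_rating + 1

    # Observed agreement matrix O: O[i, j] counts essays the human scored
    # (i + min_rating) and the machine scored (j + min_rating).
    O = np.zeros((n, n))
    for h, m in zip(human, machine):
        O[h - min_rating, m - min_rating] += 1

    # Expected matrix E under rater independence, scaled to the same total.
    hist_h = O.sum(axis=1)
    hist_m = O.sum(axis=0)
    E = np.outer(hist_h, hist_m) / O.sum()

    # Quadratic disagreement weights: larger score gaps are penalized more.
    idx = np.arange(n)
    W = (idx[:, None] - idx[None, :]) ** 2 / (n - 1) ** 2

    return 1.0 - (W * O).sum() / (W * E).sum()
```

For example, human scores [1, 2, 3, 4] against machine scores [1, 2, 3, 3] give a QWK of 0.875; the paper's point is that even a high value of such a metric does not guarantee sensible behavior on off-topic, gibberish, or paraphrased inputs.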
Descriptors: Man Machine Systems, Automation, Computer Assisted Testing, Scoring, Essays, Reliability, Validity, Scoring Rubrics
International Educational Data Mining Society. e-mail: admin@educationaldatamining.org; Web site: https://educationaldatamining.org/conferences/
Publication Type: Reports - Research; Speeches/Meeting Papers
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A