Showing 1 to 15 of 86 results
Peer reviewed
Abbas, Mohsin; van Rosmalen, Peter; Kalz, Marco – IEEE Transactions on Learning Technologies, 2023
For predicting and improving the quality of essays, text analytic metrics (surface, syntactic, morphological, and semantic features) can be used to provide formative feedback to students in higher education. In this study, the goal was to identify a sufficient number of features that serve as a fair proxy for the scores given by human raters…
Descriptors: Feedback (Response), Automation, Essays, Scoring
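The feature families this entry names (surface, syntactic, morphological, semantic) are straightforward to prototype. A minimal sketch of surface-feature extraction follows; the feature names and formulas here are illustrative stand-ins, not the paper's actual feature set:

```python
# Minimal sketch of surface-feature extraction for essay feedback.
# Feature names are illustrative; the paper's feature set is richer
# (surface, syntactic, morphological, and semantic families).
import re

def surface_features(essay: str) -> dict:
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    n_words = len(words)
    n_sents = max(len(sentences), 1)
    return {
        "word_count": n_words,
        "sentence_count": n_sents,
        "avg_sentence_length": n_words / n_sents,
        "avg_word_length": sum(map(len, words)) / max(n_words, 1),
        "type_token_ratio": len({w.lower() for w in words}) / max(n_words, 1),
    }

print(surface_features("Essays vary. Some are short; others run long."))
```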
Peer reviewed
Wendler, Cathy; Glazer, Nancy; Cline, Frederick – ETS Research Report Series, 2019
One of the challenges in scoring constructed-response (CR) items and tasks is ensuring that rater drift does not occur during or across scoring windows. Rater drift reflects changes in how raters interpret and use established scoring criteria to assign essay scores. Calibration is a process used to help control rater drift and, as such, serves as…
Descriptors: College Entrance Examinations, Graduate Study, Accuracy, Test Reliability
Peer reviewed
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In existing research on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
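The human-automated agreement benchmark this abstract mentions is most often quantified with quadratic weighted kappa (QWK). A self-contained sketch, assuming integer scores on a shared rubric scale (the function and example data are mine, not from the paper):

```python
# Quadratic weighted kappa (QWK), the usual human-machine agreement
# benchmark in AES research. Assumes integer scores on a shared scale.
import numpy as np

def quadratic_weighted_kappa(human, machine, min_score, max_score):
    n = max_score - min_score + 1
    O = np.zeros((n, n))                      # observed score-pair matrix
    for h, m in zip(human, machine):
        O[h - min_score, m - min_score] += 1
    w = np.array([[(i - j) ** 2 / (n - 1) ** 2 for j in range(n)]
                  for i in range(n)])         # quadratic disagreement weights
    hist_h = O.sum(axis=1)                    # human score histogram
    hist_m = O.sum(axis=0)                    # machine score histogram
    E = np.outer(hist_h, hist_m) / O.sum()    # expected matrix under chance
    return 1.0 - (w * O).sum() / (w * E).sum()

print(quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 2, 4], 1, 4))  # 0.9
```

QWK penalizes disagreement quadratically, so a machine score two points off a human score costs four times as much as a score one point off.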
Peer reviewed
Kumar, Vivekanandan S.; Boulanger, David – International Journal of Artificial Intelligence in Education, 2021
This article investigates the feasibility of using automated scoring methods to evaluate the quality of student-written essays. In 2012, Kaggle hosted an Automated Student Assessment Prize contest to find effective solutions to automated testing and grading. This article: a) analyzes the datasets from the contest -- which contained hand-graded…
Descriptors: Automation, Scoring, Essays, Writing Evaluation
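The Kaggle ASAP datasets this article analyzes are publicly available. A hedged sketch for inspecting the human-human agreement they contain; the file and column names follow the public Kaggle release, so treat them as assumptions about your copy:

```python
# Inspect human-human agreement in an ASAP-style dataset.
# File name, encoding, and columns (essay_set, rater1_domain1,
# rater2_domain1) are assumptions based on the public release.
import pandas as pd

df = pd.read_csv("training_set_rel3.tsv", sep="\t", encoding="latin-1")
for essay_set, group in df.groupby("essay_set"):
    r1, r2 = group["rater1_domain1"], group["rater2_domain1"]
    exact = (r1 == r2).mean()
    print(f"essay set {essay_set}: n={len(group)}, "
          f"human-human exact agreement={exact:.2f}")
```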
Peer reviewed
Ramon-Casas, Marta; Nuño, Neus; Pons, Ferran; Cunillera, Toni – Assessment & Evaluation in Higher Education, 2019
This article presents an empirical evaluation of the validity and reliability of a peer-assessment activity to improve academic writing competences. Specifically, we explored a large group of psychology undergraduate students with different initial writing skills. Participants (n = 365) produced two different essays, which were evaluated by their…
Descriptors: Peer Evaluation, Validity, Reliability, Writing Skills
Peer reviewed
Pruchnic, Jeff; Barton, Ellen; Primeau, Sarah; Trimble, Thomas; Varty, Nicole; Foster, Tanina – Composition Forum, 2021
Over the past two decades, reflective writing has occupied an increasingly prominent position in composition theory, pedagogy, and assessment as researchers have described the value of reflection and reflective writing in college students' development of higher-order writing skills, such as genre conventions (Yancey, "Reflection";…
Descriptors: Reflection, Correlation, Essays, Freshman Composition
Peer reviewed
Shabani, Enayat A.; Panahi, Jaleh – Language Testing in Asia, 2020
The literature on using scoring rubrics in writing assessment underscores the significance of rubrics as a practical and useful means of assessing the quality of writing tasks. This study investigates the agreement among rubrics endorsed and used for assessing the essay writing tasks of internationally recognized tests of English language…
Descriptors: Writing Evaluation, Scoring Rubrics, Scores, Interrater Reliability
Jiyeo Yun – English Teaching, 2023
Studies of automatic scoring systems in writing assessment have evaluated the relationship between human and machine scores to establish the reliability of automated essay scoring systems. This study investigated the magnitudes of indices of inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…
Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring
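The agreement and discrepancy indices such a meta-analysis aggregates typically include exact agreement, adjacent agreement, Pearson correlation, and mean absolute discrepancy. A sketch under that assumption (this index set is the conventional one in the AES literature, not necessarily the study's):

```python
# Common human-machine agreement and discrepancy indices from the
# AES literature. Sketch only; scores assumed to be on the same scale.
import numpy as np

def agreement_indices(h, m):
    h, m = np.asarray(h), np.asarray(m)
    return {
        "exact_agreement": float((h == m).mean()),
        "adjacent_agreement": float((np.abs(h - m) <= 1).mean()),
        "pearson_r": float(np.corrcoef(h, m)[0, 1]),
        "mean_discrepancy": float(np.abs(h - m).mean()),
    }

print(agreement_indices([1, 2, 3, 4, 5], [1, 3, 3, 3, 5]))
```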
Peer reviewed
Lian Li; Jiehui Hu; Yu Dai; Ping Zhou; Wanhong Zhang – Reading & Writing Quarterly, 2024
This paper proposes using depth perception to represent raters' decisions in the holistic evaluation of ESL essays, as an alternative to the conventional form of numerical scores. The researchers verified the new method's accuracy and inter-/intra-rater reliability by inviting 24 ESL teachers to perform different representations when rating 60…
Descriptors: Essays, Holistic Approach, Writing Evaluation, Accuracy
Ballard, Laura – ProQuest LLC, 2017
Rater scoring has an impact on writing test reliability and validity. Thus, there has been a continued call for researchers to investigate issues related to rating (Crusan, 2015). Investigating the scoring process and understanding how raters arrive at particular scores are critical "because the score is ultimately what will be used in making…
Descriptors: Evaluators, Schemata (Cognition), Eye Movements, Scoring Rubrics
Peer reviewed
Chung-You Tsai; Yi-Ti Lin; Iain Kelsall Brown – Education and Information Technologies, 2024
This study examines the impact of using ChatGPT to assist English as a foreign language (EFL) college majors in revising essays, and whether such assistance leads to higher scores and potential unfairness. A prospective, double-blinded, paired-comparison study was conducted in February 2023. A total of 44 students provided 44 original essays…
Descriptors: Artificial Intelligence, Computer Software, Technology Uses in Education, English (Second Language)
Peer reviewed
Humphry, Stephen Mark; Heldsinger, Sandy – Journal of Educational Measurement, 2019
To capitalize on professional expertise in educational assessment, it is desirable to develop and test methods of rater-mediated assessment that enable classroom teachers to make reliable and informative judgments. Accordingly, this article investigates the reliability of a two-stage method used by classroom teachers to assess primary school…
Descriptors: Essays, Elementary School Students, Writing (Composition), Writing Evaluation
Peer reviewed
Ke, Xiaohua; Zeng, Yongqiang; Luo, Haijiao – Journal of Educational Measurement, 2016
This article presents a novel method, the Complex Dynamics Essay Scorer (CDES), for automated essay scoring using complex network features. Texts produced by college students in China were represented as scale-free networks (e.g., a word adjacency model) from which typical network features, such as the in-/out-degrees, clustering coefficient (CC),…
Descriptors: Scoring, Automation, Essays, Networks
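The network representation this abstract describes can be prototyped directly: build a directed word-adjacency graph and read off degree and clustering features. A sketch using networkx; the CDES's full feature set and scoring model are not reproduced here:

```python
# Sketch of a word-adjacency network for essay analysis: nodes are word
# types, directed edges link adjacent tokens. Degree and clustering
# features of the kind the article names are then read off the graph.
import networkx as nx

def adjacency_network_features(tokens):
    g = nx.DiGraph()
    for a, b in zip(tokens, tokens[1:]):
        g.add_edge(a, b)
    in_degrees = dict(g.in_degree())
    return {
        "nodes": g.number_of_nodes(),
        "edges": g.number_of_edges(),
        "mean_in_degree": sum(in_degrees.values()) / g.number_of_nodes(),
        "clustering_coefficient": nx.average_clustering(g.to_undirected()),
    }

tokens = "the cat sat on the mat and the cat slept".split()
print(adjacency_network_features(tokens))
```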
Peer reviewed
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Peer reviewed
He, Tung-hsien – SAGE Open, 2019
This study employed a mixed-design approach and the Many-Facet Rasch Measurement (MFRM) framework to investigate whether rater bias occurred between the onscreen scoring (OSS) mode and the paper-based scoring (PBS) mode. Nine human raters analytically marked scanned scripts and paper scripts using a six-category (i.e., six-criterion) rating…
Descriptors: Computer Assisted Testing, Scoring, Item Response Theory, Essays
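The Many-Facet Rasch Measurement framework named here is conventionally written, following Linacre's formulation (the notation below is the textbook one, not taken from this article), as:

```latex
% Many-facet Rasch model (Linacre): log-odds of examinee n receiving
% category k rather than k-1 on criterion i from rater j.
\[
  \log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
\]
% B_n: examinee ability, D_i: criterion difficulty,
% C_j: rater severity, F_k: threshold of category k.
```

Rater bias between scoring modes can then be examined as an interaction between the rater-severity facet C_j and a scoring-mode facet.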