Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 13 |
Since 2006 (last 20 years) | 26 |
Descriptor
Source
Author
Coniam, David | 2 |
Gentile, Claudia | 2 |
Kantor, Robert | 2 |
Lee, Yong-Won | 2 |
Attali, Yigal | 1 |
Barton, Ellen | 1 |
Ben-Simon, Anat | 1 |
Berger, Cynthia M. | 1 |
Breland, Hunter M. | 1 |
Breyer, F. Jay | 1 |
Bridgeman, Brent | 1 |
More ▼ |
Publication Type
Journal Articles | 24 |
Reports - Research | 22 |
Tests/Questionnaires | 3 |
Dissertations/Theses -… | 2 |
Reports - Evaluative | 2 |
Speeches/Meeting Papers | 2 |
Information Analyses | 1 |
Reports - Descriptive | 1 |
Education Level
Higher Education | 11 |
Postsecondary Education | 9 |
Secondary Education | 3 |
Grade 11 | 1 |
High Schools | 1 |
Audience
Laws, Policies, & Programs
Assessments and Surveys
Test of English as a Foreign… | 3 |
Graduate Record Examinations | 2 |
Test of Standard Written… | 1 |
What Works Clearinghouse Rating
Pruchnic, Jeff; Barton, Ellen; Primeau, Sarah; Trimble, Thomas; Varty, Nicole; Foster, Tanina – Composition Forum, 2021
Over the past two decades, reflective writing has occupied an increasingly prominent position in composition theory, pedagogy, and assessment as researchers have described the value of reflection and reflective writing in college students' development of higher-order writing skills, such as genre conventions (Yancey, "Reflection";…
Descriptors: Reflection, Correlation, Essays, Freshman Composition
Jiyeo Yun – English Teaching, 2023
Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…
Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring
Ke, Xiaohua; Zeng, Yongqiang; Luo, Haijiao – Journal of Educational Measurement, 2016
This article presents a novel method, the Complex Dynamics Essay Scorer (CDES), for automated essay scoring using complex network features. Texts produced by college students in China were represented as scale-free networks (e.g., a word adjacency model) from which typical network features, such as the in-/out-degrees, clustering coefficient (CC),…
Descriptors: Scoring, Automation, Essays, Networks
Xie, Qin – Language Assessment Quarterly, 2020
This article describes the steps we went through in designing and validating an item bank to diagnose linguistic problems in the English academic writing of university students in Hong Kong. Test items adopt traditional item formats (e.g., MCQ, grammatical judgment tasks, and error correction) but are based on authentic language materials…
Descriptors: English for Academic Purposes, Second Language Learning, Second Language Instruction, Item Analysis
Cohen, Yoav; Levi, Effi; Ben-Simon, Anat – Applied Measurement in Education, 2018
In the current study, two pools of 250 essays, all written as a response to the same prompt, were rated by two groups of raters (14 or 15 raters per group), thereby providing an approximation to the essay's true score. An automated essay scoring (AES) system was trained on the datasets and then scored the essays using a cross-validation scheme. By…
Descriptors: Test Validity, Automation, Scoring, Computer Assisted Testing
Guo, Xiuyan; Lei, Pui-Wa – International Journal of Testing, 2020
Little research has been done on the effects of peer raters' quality characteristics on peer rating qualities. This study aims to address this gap and investigate the effects of key variables related to peer raters' qualities, including content knowledge, previous rating experience, training on rating tasks, and rating motivation. In an experiment…
Descriptors: Peer Evaluation, Error Patterns, Correlation, Knowledge Level
Yun, Jiyeo – ProQuest LLC, 2017
Since researchers investigated automatic scoring systems in writing assessments, they have dealt with relationships between human and machine scoring, and then have suggested evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…
Descriptors: Interrater Reliability, Essays, Scoring, Evaluators
Steedle, Jeffrey T.; Ferrara, Steve – Applied Measurement in Education, 2016
As an alternative to rubric scoring, comparative judgment generates essay scores by aggregating decisions about the relative quality of the essays. Comparative judgment eliminates certain scorer biases and potentially reduces training requirements, thereby allowing a large number of judges, including teachers, to participate in essay evaluation.…
Descriptors: Essays, Scoring, Comparative Analysis, Evaluators
Morgan, Grant B.; Zhu, Min; Johnson, Robert L.; Hodge, Kari J. – Language Assessment Quarterly, 2014
Common estimators of interrater reliability include Pearson product-moment correlation coefficients, Spearman rank-order correlations, and the generalizability coefficient. The purpose of this study was to examine the accuracy of estimators of interrater reliability when varying the true reliability, number of scale categories, and number of…
Descriptors: Interrater Reliability, Correlation, Generalization, Scoring
Kural, Faruk – Journal of Language and Linguistic Studies, 2018
The present paper, which is a study based on midterm exam results of 53 University English prep-school students, examines correlation between a direct writing test, measured holistically by multiple-trait scoring, and two indirect writing tests used in a competence exam, one of which is a multiple-choice cloze test and the other a rewrite test…
Descriptors: Writing Evaluation, Cloze Procedure, Comparative Analysis, Essays
Skalicky, Stephen; Berger, Cynthia M.; Crossley, Scott A.; McNamara, Danielle S. – Advances in Language and Literary Studies, 2016
A corpus of 313 freshman college essays was analyzed in order to better understand the forms and functions of humor in academic writing. Human ratings of humor and wordplay were statistically aggregated using Factor Analysis to provide an overall "Humor" component score for each essay in the corpus. In addition, the essays were also…
Descriptors: Discourse Analysis, Academic Discourse, Humor, Writing (Composition)
Powers, Donald E.; Escoffery, David S.; Duchnowski, Matthew P. – Applied Measurement in Education, 2015
By far, the most frequently used method of validating (the interpretation and use of) automated essay scores has been to compare them with scores awarded by human raters. Although this practice is questionable, human-machine agreement is still often regarded as the "gold standard." Our objective was to refine this model and apply it to…
Descriptors: Essays, Test Scoring Machines, Program Validation, Criterion Referenced Tests
Forkner, Carl B. – ProQuest LLC, 2013
Compressed time from matriculation to graduation prevalent in executive graduate degree programs means that traditional methods of identifying deficiencies in student writing during the course of study may not provide timely remediation enabling for student success. This study examined a writing-intensive, 10-month executive graduate degree…
Descriptors: Writing Evaluation, Graduate Students, Computer Assisted Testing, Predictive Validity
Liu, Sha; Kunnan, Antony John – CALICO Journal, 2016
This study investigated the application of "WriteToLearn" on Chinese undergraduate English majors' essays in terms of its scoring ability and the accuracy of its error feedback. Participants were 163 second-year English majors from a university located in Sichuan province who wrote 326 essays from two writing prompts. Each paper was…
Descriptors: Foreign Countries, Undergraduate Students, English (Second Language), Second Language Learning
Jarvis, Scott – Language Testing, 2017
The present study discusses the relevance of measures of lexical diversity (LD) to the assessment of learner corpora. It also argues that existing measures of LD, many of which have become specialized for use with language corpora, are fundamentally measures of lexical repetition, are based on an etic perspective of language, and lack construct…
Descriptors: Computational Linguistics, English (Second Language), Second Language Learning, Native Speakers
Previous Page | Next Page ยป
Pages: 1 | 2