Showing all 15 results
Peer reviewed
Wheeler, Jordan M.; Engelhard, George; Wang, Jue – Measurement: Interdisciplinary Research and Perspectives, 2022
Objectively scoring constructed-response items on educational assessments has long been a challenge due to the use of human raters. Even well-trained raters using a rubric can inaccurately assess essays. Unfolding models measure raters' scoring accuracy by capturing the discrepancy between criterion and operational ratings, placing essays on an…
Descriptors: Accuracy, Scoring, Statistical Analysis, Models
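For context, a minimal sketch of one common single-peaked unfolding response function, the hyperbolic cosine model; the abstract does not give the authors' exact specification, so this is an illustrative assumption. With theta_i the location of essay i on the latent continuum, delta_j a criterion location, and gamma a unit parameter:

P(X_{ij} = 1 \mid \theta_i, \delta_j) = \frac{\exp(\gamma)}{\exp(\gamma) + 2\cosh(\theta_i - \delta_j)}

The probability of agreement peaks when theta_i and delta_j coincide and falls off symmetrically as the discrepancy grows.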
Michael Matta; Milena A. Keller-Margulis; Sterett H. Mercer – Grantee Submission, 2022
Although researchers have investigated technical adequacy and usability of written-expression curriculum-based measures (WE-CBM), the economic implications of different scoring approaches have largely been ignored. The absence of such knowledge can undermine the effective allocation of resources and lead to the adoption of suboptimal measures for…
Descriptors: Cost Effectiveness, Scoring, Automation, Writing Tests
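As a hedged illustration of how such cost comparisons are typically framed, a sketch in Python; every figure below is a hypothetical placeholder, not a value from the study.

def cost_per_sample(minutes_per_sample: float, hourly_rate: float,
                    fixed_cost: float, n_samples: int) -> float:
    """Average cost of scoring one writing sample under a given approach."""
    variable = (minutes_per_sample / 60) * hourly_rate * n_samples
    return (fixed_cost + variable) / n_samples

# Hypothetical comparison: hand scoring vs. an automated pipeline with a license fee.
print(cost_per_sample(minutes_per_sample=5, hourly_rate=25, fixed_cost=0, n_samples=1000))
print(cost_per_sample(minutes_per_sample=0.1, hourly_rate=25, fixed_cost=1500, n_samples=1000))

The point of such an exercise is that automated scoring trades a fixed up-front cost for a much lower marginal cost per sample.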
Peer reviewed
Keller-Margulis, Milena A.; Mercer, Sterett H.; Matta, Michael – Reading and Writing: An Interdisciplinary Journal, 2021
Existing approaches to measuring writing performance are insufficient in terms of both technical adequacy and feasibility for use as a screening measure. This study examined the validity and diagnostic accuracy of several approaches to automated text evaluation as well as written expression curriculum-based measurement (WE-CBM) to determine…
Descriptors: Writing Evaluation, Validity, Automation, Curriculum Based Assessment
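A minimal sketch of how the diagnostic accuracy of competing scoring approaches can be compared via ROC AUC; the data and variable names are illustrative assumptions, not the study's.

from sklearn.metrics import roc_auc_score

at_risk = [1, 0, 1, 0, 0, 1]                    # 1 = later failed the criterion test (hypothetical)
auto_scores = [2.1, 4.0, 3.6, 3.5, 4.2, 2.5]    # automated text-evaluation scores (hypothetical)
wecbm_scores = [18, 40, 15, 33, 45, 20]         # WE-CBM correct word sequences (hypothetical)

# Lower writing scores should indicate risk, so negate scores before computing AUC.
print(roc_auc_score(at_risk, [-s for s in auto_scores]))   # ~0.89 on this toy data
print(roc_auc_score(at_risk, [-s for s in wecbm_scores]))  # 1.0 on this toy data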
Mercer, Sterett H.; Cannon, Joanna E.; Squires, Bonita; Guo, Yue; Pinco, Ella – Canadian Journal of School Psychology, 2021
We examined the extent to which automated written expression curriculum-based measurement (aWE-CBM) can be used to accurately computer-score student writing samples for screening and progress monitoring. Students (n = 174) with learning difficulties in Grades 1 to 12 who received 1:1 academic tutoring through a community-based organization…
Descriptors: Curriculum Based Assessment, Automation, Scoring, Writing Tests
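For illustration, a simplified sketch of the metrics that automated WE-CBM scoring approximates; this is an assumption about the metric family, not the authors' scoring engine, and real Correct Word Sequences scoring also checks grammar and punctuation. The lexicon is a stand-in for a full dictionary.

import re

def wecbm_metrics(text: str, lexicon: set[str]) -> dict[str, int]:
    words = re.findall(r"[A-Za-z']+", text.lower())
    tww = len(words)                                   # Total Words Written
    correct = [w in lexicon for w in words]
    wsc = sum(correct)                                 # Words Spelled Correctly
    # Simplified Correct Word Sequences: adjacent pairs where both words are in the lexicon.
    cws = sum(1 for a, b in zip(correct, correct[1:]) if a and b)
    return {"TWW": tww, "WSC": wsc, "CWS": cws}

lexicon = {"the", "dog", "ran", "fast"}
print(wecbm_metrics("The dog ran fsat", lexicon))      # {'TWW': 4, 'WSC': 3, 'CWS': 2}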
Keller-Margulis, Milena A.; Mercer, Sterett H.; Matta, Michael – Grantee Submission, 2021
Existing approaches to measuring writing performance are insufficient in terms of both technical adequacy and feasibility for use as a screening measure. This study examined the validity and diagnostic accuracy of several approaches to automated text evaluation as well as written expression curriculum-based measurement (WE-CBM) to determine…
Descriptors: Writing Evaluation, Validity, Automation, Curriculum Based Assessment
Mercer, Sterett H.; Cannon, Joanna E.; Squires, Bonita; Guo, Yue; Pinco, Ella – Grantee Submission, 2021
We examined the extent to which automated written expression curriculum-based measurement (aWE-CBM) can be used to accurately computer-score student writing samples for screening and progress monitoring. Students (n = 174) with learning difficulties in Grades 1-12 who received 1:1 academic tutoring through a community-based organization completed…
Descriptors: Curriculum Based Assessment, Automation, Scoring, Writing Tests
Yi Gui – ProQuest LLC, 2024
This study explores using transfer learning in machine learning for natural language processing (NLP) to create generic automated essay scoring (AES) models, providing instant online scoring for statewide writing assessments in K-12 education. The goal is to develop an instant online scorer that is generalizable to any prompt, addressing the…
Descriptors: Writing Tests, Natural Language Processing, Writing Evaluation, Scoring
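A hedged sketch of the general transfer-learning recipe for AES (fine-tuning a pretrained transformer with a regression head); the model choice, data, and hyperparameters are illustrative assumptions, not the dissertation's setup.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
# num_labels=1 gives a single regression output; Hugging Face then uses MSE loss.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=1)

essays = ["First essay text...", "Second essay text..."]     # placeholder essays
scores = torch.tensor([[3.0], [5.0]])                        # placeholder holistic scores

batch = tok(essays, padding=True, truncation=True, return_tensors="pt")
out = model(**batch, labels=scores)
out.loss.backward()   # an optimizer step would follow in a full training loop

Because the pretrained encoder already carries general language knowledge, a model fine-tuned this way on essays from many prompts can, in principle, score essays on prompts it has never seen.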
Wilson, Joshua; Rodrigues, Jessica – Grantee Submission, 2020
The present study leveraged advances in automated essay scoring (AES) technology to explore a proof of concept for a writing screener using the "Project Essay Grade" (PEG) program. First, the study investigated the extent to which an AES-scored multi-prompt writing screener accurately classified students as at risk of failing a Common…
Descriptors: Writing Tests, Screening Tests, Classification, Accuracy
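A minimal sketch of the classification-accuracy logic behind such a screener, with an assumed cut score; "at risk" is flagged when the screener score falls below the cut, and all numbers are hypothetical.

def screen(scores, outcomes, cut):
    """scores: AES screener scores; outcomes: True if the student later failed the test."""
    tp = sum(s < cut and o for s, o in zip(scores, outcomes))       # flagged, failed
    fn = sum(s >= cut and o for s, o in zip(scores, outcomes))      # missed, failed
    tn = sum(s >= cut and not o for s, o in zip(scores, outcomes))  # not flagged, passed
    fp = sum(s < cut and not o for s, o in zip(scores, outcomes))   # flagged, passed
    return tp / (tp + fn), tn / (tn + fp)                           # sensitivity, specificity

print(screen([380, 395, 410, 450], [True, False, True, False], cut=400))  # (0.5, 0.5)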
Peer reviewed
Yao, Lili; Haberman, Shelby J.; Zhang, Mo – ETS Research Report Series, 2019
Many assessments of writing proficiency that aid in making high-stakes decisions consist of several essay tasks evaluated by a combination of human holistic scores and computer-generated scores for essay features such as the rate of grammatical errors per word. Under typical conditions, a summary writing score is provided by a linear combination…
Descriptors: Prediction, True Scores, Computer Assisted Testing, Scoring
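In the notation such composite scores suggest, with H_i the human holistic score for examinee i and F_{im} the machine-scored essay features, the reported summary score typically has the form (a sketch, not necessarily the authors' exact weighting):

S_i = w_0 + w_H H_i + \sum_m w_m F_{im}

where the weights can be chosen, for example, to best predict the examinee's true writing score, consistent with the Prediction and True Scores descriptors below.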
Ricker-Pedley, Kathryn L. – Educational Testing Service, 2011
A pseudo-experimental study was conducted to examine the link between rater accuracy calibration performances and subsequent accuracy during operational scoring. The study asked 45 raters to score a 75-response calibration set and then a 100-response (operational) set of responses from a retired Graduate Record Examinations® (GRE®) writing…
Descriptors: Scoring, Accuracy, College Entrance Examinations, Writing Tests
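One simple way to quantify the accuracy such a study tracks is exact and adjacent agreement between a rater's operational scores and the criterion scores; a sketch with illustrative data.

def agreement_rates(rater_scores, criterion_scores):
    n = len(criterion_scores)
    exact = sum(r == c for r, c in zip(rater_scores, criterion_scores)) / n
    adjacent = sum(abs(r - c) <= 1 for r, c in zip(rater_scores, criterion_scores)) / n
    return exact, adjacent

print(agreement_rates([4, 3, 5, 2], [4, 4, 5, 1]))  # -> (0.5, 1.0)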
Peer reviewed
Kim, Young-Suk; Al Otaiba, Stephanie; Wanzek, Jeanne; Gatlin, Brandy – Journal of Educational Psychology, 2015
We had 3 aims in the present study: (a) to examine the dimensionality of various evaluative approaches to scoring writing samples (e.g., quality, productivity, and curriculum-based measurement [CBM] writing scoring), (b) to investigate unique language and cognitive predictors of the identified dimensions, and (c) to examine the gender gap in the…
Descriptors: Writing (Composition), Gender Differences, Curriculum Based Assessment, Scoring
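A hedged sketch of the kind of dimensionality analysis the abstract describes, using simulated data with one common writing factor; the study's actual method (e.g., confirmatory factor analysis) may differ.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))             # one common writing dimension
true_loadings = np.array([[0.8, 0.7, 0.9]])    # quality, productivity, CBM scoring (assumed)
X = latent @ true_loadings + 0.4 * rng.normal(size=(200, 3))

fa = FactorAnalysis(n_components=1).fit(X)
print(fa.components_)   # estimated loadings, close to the generating values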
Park, Yoon Soo – ProQuest LLC, 2011
The use of constructed response (CR) items or performance tasks to assess test takers' ability has grown tremendously over the past decade. Examples of CR items in psychological and educational measurement range from essays and works of art to admissions interviews. However, unlike multiple-choice (MC) items that have predetermined options, CR…
Descriptors: Responses, Scoring, Item Response Theory, Testing
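For context, one standard polytomous IRT model often applied to CR ratings, the generalized partial credit model; the dissertation's actual model may differ. For score category k on item j:

P(X_{ij} = k \mid \theta_i) = \frac{\exp\left(\sum_{v=1}^{k} a_j(\theta_i - b_{jv})\right)}{\sum_{c=0}^{m_j} \exp\left(\sum_{v=1}^{c} a_j(\theta_i - b_{jv})\right)}

with the empty sum for c = 0 defined as 0, a_j the item discrimination, and b_{jv} the step difficulties.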
Kim, Young-Suk; Al Otaiba, Stephanie; Wanzek, Jeanne; Gatlin, Brandy – Grantee Submission, 2015
We had 3 aims in the present study: (a) to examine the dimensionality of various evaluative approaches to scoring writing samples (e.g., quality, productivity, and curriculum-based measurement [CBM] writing scoring), (b) to investigate unique language and cognitive predictors of the identified dimensions, and (c) to examine the gender gap in the…
Descriptors: Writing (Composition), Gender Differences, Curriculum Based Assessment, Scoring
Haberman, Shelby J. – Educational Testing Service, 2011
Alternative approaches are discussed for use of e-rater® to score the TOEFL iBT® Writing test. These approaches involve alternate criteria. In the 1st approach, the predicted variable is the expected rater score of the examinee's 2 essays. In the 2nd approach, the predicted variable is the expected rater score of 2 essay responses by the…
Descriptors: Writing Tests, Scoring, Essays, Language Tests
Peer reviewed
DeCarlo, Lawrence T. – ETS Research Report Series, 2008
Rater behavior in essay grading can be viewed as a signal-detection task, in that raters attempt to discriminate between latent classes of essays, with the latent classes being defined by a scoring rubric. The present report examines basic aspects of an approach to constructed-response (CR) scoring via a latent-class signal-detection model. The…
Descriptors: Scoring, Responses, Test Format, Bias
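In outline, the latent-class signal-detection idea: the rater's perception of an essay from latent class t is a noisy draw centered at d_t, and the observed score k results from ordered criteria c_1 < ... < c_{K-1}. A sketch of the general form, not necessarily the report's exact parameterization:

P(Y_i = k \mid T_i = t) = \Phi(c_k - d_t) - \Phi(c_{k-1} - d_t), \quad c_0 = -\infty, \; c_K = +\infty

This separates a rater's discrimination among the latent essay classes (the spacing of the d_t) from where the rater places the scoring criteria, which is what lets the model characterize rater bias and accuracy separately.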