Showing all 8 results
Peer reviewed
Direct link
Wang, Jue; Engelhard, George, Jr.; Wolfe, Edward W. – Educational and Psychological Measurement, 2016
The number of performance assessments continues to increase around the world, and it is important to explore new methods for evaluating the quality of ratings obtained from raters. This study describes an unfolding model for examining rater accuracy. Accuracy is defined as the difference between observed and expert ratings. Dichotomous accuracy…
Descriptors: Evaluators, Accuracy, Performance Based Assessment, Models
Peer reviewed
Direct link
Lai, Emily R.; Wolfe, Edward W.; Vickers, Daisy – Educational and Psychological Measurement, 2015
This report summarizes an empirical study that addresses two related topics within the context of writing assessment--illusory halo and how much unique information is provided by multiple analytic scores. Specifically, we address the issue of whether unique information is provided by analytic scores assigned to student writing, beyond what is…
Descriptors: Writing Tests, Scores, Bias, Holistic Approach
Peer reviewed
PDF on ERIC (full text available)
Wolfe, Edward W.; Matthews, Staci; Vickers, Daisy – Journal of Technology, Learning, and Assessment, 2010
This study examined the influence of rater training and scoring context on training time, scoring time, qualifying rate, quality of ratings, and rater perceptions. One hundred twenty raters participated in the study and experienced one of three training contexts: (a) online training in a distributed scoring context, (b) online training in a…
Descriptors: Writing Evaluation, Writing Tests, Qualifications, Program Effectiveness
Chiu, Chris W. T.; Wolfe, Edward W. – 1997
Unstable, and potentially invalid, variance component estimates may result from using only a limited portion of available data from operational performance assessments. However, missing observations are common in these settings because of the nature of the assessment design. This paper describes a procedure for overcoming the computational and…
Descriptors: College Students, Data Analysis, Essay Tests, Generalizability Theory
Wolfe, Edward W.; Kao, Chi-Wen – 1996
The amount of variability contributed to large-scale performance assessment scores by raters is a constant concern for those who wish to use results from these assessments for educational decisions. This study approaches the problem by examining the behaviors of essay scorers who demonstrate different levels of proficiency with a holistic scoring…
Descriptors: Essay Tests, Experience, Holistic Approach, Judges
Wolfe, Edward W.; Kao, Chi-Wen – 1996
This paper reports the results of an analysis of the relationship between scorer behaviors and score variability. Thirty-six essay scorers were interviewed and asked to perform a think-aloud task as they scored 24 essays. Each comment made by a scorer was coded according to its content focus (i.e., appearance, assignment, mechanics, communication,…
Descriptors: Content Analysis, Educational Assessment, Essays, Evaluation Methods
Wolfe, Edward W.; Feltovich, Brian – 1994
This paper presents a model of scored cognition that incorporates two types of mental models: models of performance (i.e., the criteria for judging performance) and models of scoring (i.e., the procedural scripts for scoring an essay). In Study 1, six novice and five experienced scorers wrote definitions of three levels of a 6-point holistic…
Descriptors: Cognitive Processes, Criteria, Essays, Evaluation Methods
Peer reviewed
Direct link
Wolfe, Edward W.; Manalo, Jonathan R. – Language Learning & Technology, 2004
The Test of English as a Foreign Language (TOEFL) contains a direct writing assessment, and examinees are given the option of composing their responses at a computer terminal using a keyboard or composing their responses in handwriting. This study sought to determine whether performance on a direct writing assessment is comparable for examinees…
Descriptors: Writing Evaluation, Handwriting, Writing Tests, Computer Terminals