Publication Date
In 2025: 0
Since 2024: 0
Since 2021 (last 5 years): 1
Since 2016 (last 10 years): 10
Since 2006 (last 20 years): 24
Descriptor
Item Response Theory: 33
Models: 11
Measurement: 8
Psychometrics: 8
Writing Evaluation: 6
Writing Tests: 6
Difficulty Level: 5
Evaluation Methods: 5
Goodness of Fit: 5
Measurement Techniques: 5
Educational Assessment: 4
Author
Engelhard, George, Jr.: 33
Wang, Jue: 5
Wind, Stefanie A.: 5
Garner, Mary: 3
Randall, Jennifer: 2
Walker, A. Adrienne: 2
Wolfe, Edward W.: 2
Behizadeh, Nadia: 1
Chang, Mei-Lin: 1
Cheong, Yuk Fai: 1
Foltz, Peter: 1
Audience
Teachers: 1
Assessments and Surveys
Advanced Placement…: 2
Georgia Criterion Referenced…: 2
SAT (College Admission Test): 1
Turner, Kyle T.; Engelhard, George, Jr. – Measurement: Interdisciplinary Research and Perspectives, 2023
The purpose of this study is to illustrate the use of functional data analysis (FDA) as a general methodology for analyzing person response functions (PRFs). Applications of FDA to psychometrics have included the estimation of item response functions and latent distributions, as well as differential item functioning. Although FDA has been…
Descriptors: Data Analysis, Item Response Theory, Psychometrics, Statistical Distributions
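A brief sketch of the idea behind this entry, not the authors' estimator (the basis functions and coefficients below are assumed notation): a person response function gives the probability of a correct response as a smooth function of item difficulty, and functional data analysis represents that curve with a basis expansion.

```latex
% Person response function for person n, viewed as a smooth curve over item difficulty \delta
P_n(\delta) = \Pr(X_{ni} = 1 \mid \delta_i = \delta)
% FDA-style representation using an assumed spline basis \phi_1, \dots, \phi_K
\hat{P}_n(\delta) = \sum_{k=1}^{K} c_{nk}\, \phi_k(\delta)
```

Under this sketch, the coefficients c_{nk} would be obtained by penalized smoothing of person n's item scores ordered by difficulty; flat or non-monotonic estimated curves flag response patterns that warrant closer inspection.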
Wang, Jue; Engelhard, George, Jr. – Journal of Educational Measurement, 2019
Rater-mediated assessments exhibit scoring challenges due to the involvement of human raters. The quality of human ratings largely determines the reliability, validity, and fairness of the assessment process. Our research recommends that the evaluation of ratings should be based on two aspects: a theoretical model of human judgment and an…
Descriptors: Evaluative Thinking, Models, Measurement, Achievement
Walker, A. Adrienne; Jennings, Jeremy Kyle; Engelhard, George, Jr. – Educational Assessment, 2018
Individual person fit analyses provide important information regarding the validity of test score inferences for an "individual" test taker. In this study, we use data from an undergraduate statistics test (N = 1135) to illustrate a two-step method that researchers and practitioners can use to examine individual person fit. First, person…
Descriptors: Test Items, Test Validity, Scores, Statistics
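As a hedged illustration of the kind of individual person-fit index discussed in this entry, the sketch below computes the standardized log-likelihood statistic l_z under a dichotomous Rasch model; this is a standard person-fit statistic, not necessarily the authors' two-step procedure, and the data are hypothetical.

```python
import numpy as np

def rasch_prob(theta, b):
    """Probability of a correct response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def lz_person_fit(responses, theta, b):
    """Standardized log-likelihood person-fit statistic (l_z).

    responses : 0/1 array of one person's item scores
    theta     : that person's ability estimate (logits)
    b         : array of item difficulties (logits)
    Large negative values flag aberrant (misfitting) response patterns.
    """
    p = rasch_prob(theta, np.asarray(b, dtype=float))
    x = np.asarray(responses, dtype=float)
    loglik = np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    variance = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (loglik - expected) / np.sqrt(variance)

# Hypothetical pattern: a mid-ability person who misses the easiest items
# but answers the hardest items correctly
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
scores = [0, 0, 1, 1, 1]
print(round(lz_person_fit(scores, theta=0.0, b=difficulties), 2))
```

For this contrived pattern l_z is strongly negative, which is the kind of flag that would prompt a closer look at the validity of that individual's score.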
Wang, Jue; Tanaka, Victoria T.; Engelhard, George, Jr.; Rabbitt, Matthew P. – Measurement: Interdisciplinary Research and Perspectives, 2020
In this study, we proposed a multilevel explanatory Rasch model for examining measurement invariance of international food-insecurity measures on the Food Insecurity Experience (FIE) Scale from different countries. The United Nations Food and Agriculture Organization developed the FIE Scale to quantitatively measure the severity of food…
Descriptors: Item Response Theory, Measures (Individuals), Food, Security (Psychology)
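A hedged sketch of what a multilevel explanatory Rasch specification for cross-country invariance can look like; the notation here is assumed for illustration, not taken from the article.

```latex
% Rasch model for person n in country c responding to item i
\operatorname{logit}\,\Pr(X_{nic} = 1) = \theta_{nc} - \delta_{ic}
% Item difficulty decomposed into an overall difficulty and a country-specific deviation
\delta_{ic} = \delta_i + u_{ic}, \qquad u_{ic} \sim N(0, \sigma_u^2)
```

In a specification of this kind, measurement invariance across countries corresponds to negligible country-specific deviations u_{ic}; large deviations for particular item-country combinations signal non-invariant items.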
Wind, Stefanie A.; Wolfe, Edward W.; Engelhard, George, Jr.; Foltz, Peter; Rosenstein, Mark – International Journal of Testing, 2018
Automated essay scoring engines (AESEs) are becoming increasingly popular as an efficient method for performance assessments in writing, including many language assessments that are used worldwide. Before they can be used operationally, AESEs must be "trained" using machine-learning techniques that incorporate human ratings. However, the…
Descriptors: Computer Assisted Testing, Essay Tests, Writing Evaluation, Scoring
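A toy illustration of the "training" step described above: fit a scoring model that predicts human ratings from simple essay features. The feature set, essays, and ratings are hypothetical, and operational engines use far richer features and models than this least-squares sketch.

```python
import numpy as np

def extract_features(essays):
    """Return crude surface features: intercept, word count, mean word length, type/token ratio."""
    feats = []
    for text in essays:
        words = text.split()
        n = len(words)
        mean_len = sum(len(w) for w in words) / n
        ttr = len(set(words)) / n
        feats.append([1.0, n, mean_len, ttr])
    return np.array(feats)

essays = [
    "The cat sat on the mat.",
    "Measurement theory informs how essays are scored by trained raters.",
    "Automated scoring engines approximate human judgments using statistical models.",
]
human_ratings = np.array([1.0, 3.0, 4.0])  # hypothetical rubric scores

X = extract_features(essays)
weights, *_ = np.linalg.lstsq(X, human_ratings, rcond=None)  # "train" on human ratings
machine_scores = X @ weights
print(np.round(machine_scores, 2))
```

The point of the sketch is only that machine scores inherit whatever is in the human ratings used for training, which is why the quality of those ratings matters before an engine is used operationally.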
Wang, Jue; Engelhard, George, Jr. – Educational Measurement: Issues and Practice, 2019
In this digital ITEMS module, Dr. Jue Wang and Dr. George Engelhard Jr. describe the Rasch measurement framework for the construction and evaluation of new measures and scales. From a theoretical perspective, they discuss the historical and philosophical perspectives on measurement with a focus on Rasch's concept of specific objectivity and…
Descriptors: Item Response Theory, Evaluation Methods, Measurement, Goodness of Fit
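For reference, one standard statement of the dichotomous Rasch model that anchors this framework, together with the specific-objectivity property the module emphasizes:

```latex
% Dichotomous Rasch model: person ability \theta_n, item difficulty \delta_i
\Pr(X_{ni} = 1 \mid \theta_n, \delta_i) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}
% Specific objectivity: the log-odds comparison of two persons on any item i is item-free
\log \frac{\Pr(X_{1i}=1)/\Pr(X_{1i}=0)}{\Pr(X_{2i}=1)/\Pr(X_{2i}=0)} = \theta_1 - \theta_2
```

The item difficulty cancels from the comparison of the two persons, which is what makes person comparisons independent of the particular items used.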
Engelhard, George, Jr.; Perkins, Aminah – Measurement: Interdisciplinary Research and Perspectives, 2013
In this commentary, Engelhard and Perkins remark that Maydeu-Olivares has presented a framework for evaluating the goodness of model-data fit for item response theory (IRT) models and correctly points out that overall goodness-of-fit evaluations of IRT models and data are not generally explored within most applications in educational and…
Descriptors: Goodness of Fit, Item Response Theory, Models, Measurement
Wind, Stefanie A.; Engelhard, George, Jr. – Educational and Psychological Measurement, 2016
Mokken scale analysis is a probabilistic nonparametric approach that offers statistical and graphical tools for evaluating the quality of social science measurement without placing potentially inappropriate restrictions on the structure of a data set. In particular, Mokken scaling provides a useful method for evaluating important measurement…
Descriptors: Nonparametric Statistics, Statistical Analysis, Measurement, Psychometrics
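One common formulation of the scalability coefficients at the heart of Mokken scale analysis, stated here for dichotomous items as a general textbook form rather than a result specific to this article:

```latex
% Item-pair and scale scalability coefficients (Loevinger's H)
H_{ij} = \frac{\operatorname{Cov}(X_i, X_j)}{\operatorname{Cov}_{\max}(X_i, X_j)}, \qquad
H = \frac{\sum_{i<j} \operatorname{Cov}(X_i, X_j)}{\sum_{i<j} \operatorname{Cov}_{\max}(X_i, X_j)}
```

Cov_max is the largest covariance attainable given the items' marginal proportions, and a common rule of thumb treats H of at least 0.3 as the minimum for a scalable item set.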
Wang, Jue; Engelhard, George, Jr.; Wolfe, Edward W. – Educational and Psychological Measurement, 2016
The number of performance assessments continues to increase around the world, and it is important to explore new methods for evaluating the quality of ratings obtained from raters. This study describes an unfolding model for examining rater accuracy. Accuracy is defined as the difference between observed and expert ratings. Dichotomous accuracy…
Descriptors: Evaluators, Accuracy, Performance Based Assessment, Models
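A minimal sketch of the accuracy scoring this entry describes, using hypothetical rater and expert scores: accuracy is based on the difference between observed and expert ratings, here dichotomized as exact agreement versus any discrepancy. Fitting the unfolding model to the resulting accuracy scores is not shown.

```python
import numpy as np

observed = np.array([3, 2, 4, 1, 3])   # hypothetical rater scores for five essays
expert   = np.array([3, 3, 4, 2, 3])   # hypothetical expert (criterion) scores

discrepancy = observed - expert          # signed differences from the expert ratings
accuracy = (discrepancy == 0).astype(int)  # 1 = accurate, 0 = inaccurate
print(discrepancy)
print(accuracy)
```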
Engelhard, George, Jr.; Wang, Jue – Measurement: Interdisciplinary Research and Perspectives, 2014
The authors of the Focus article pose important questions regarding whether or not performance-based tasks related to executive functioning are best viewed as reflective or formative indicators. Miyake and Friedman (2012) define executive functioning (EF) as "a set of general-purpose control mechanisms, often linked to the prefrontal cortex…
Descriptors: Executive Function, Cognitive Measurement, Structural Equation Models, Item Response Theory
Walker, A. Adrienne; Engelhard, George, Jr. – Applied Measurement in Education, 2015
The idea that test scores may not be valid representations of what students know, can do, and should learn next is well known. Person fit provides an important aspect of validity evidence. Person fit analyses at the individual student level are not typically conducted and person fit information is not communicated to educational stakeholders. In…
Descriptors: Test Validity, Goodness of Fit, Educational Assessment, Hierarchical Linear Modeling
Wind, Stefanie A.; Engelhard, George, Jr.; Wesolowski, Brian – Educational Assessment, 2016
When good model-data fit is observed, the Many-Facet Rasch (MFR) model acts as a linking and equating model that can be used to estimate student achievement, item difficulties, and rater severity on the same linear continuum. Given sufficient connectivity among the facets, the MFR model provides estimates of student achievement that are equated to…
Descriptors: Evaluators, Interrater Reliability, Academic Achievement, Music Education
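For reference, the Many-Facet Rasch model in its common rating scale formulation, with facet labels matching the entry above (students, items, raters):

```latex
% Probability of student n receiving category k rather than k-1 on item i from rater j
\log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = \theta_n - \delta_i - \lambda_j - \tau_k
```

Here θ_n is student achievement, δ_i item difficulty, λ_j rater severity, and τ_k the threshold between adjacent categories; when model-data fit is adequate, all facets are calibrated on the same logit continuum, which is what supports the linking and equating use described above.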
Behizadeh, Nadia; Engelhard, George, Jr. – Assessing Writing, 2011
The purpose of this study is to examine the interactions among measurement theories, writing theories, and writing assessments in the United States from a historical perspective. The assessment of writing provides a useful framework for examining how theories influence, and in some cases fail to influence, actual practice. Two research traditions…
Descriptors: Writing (Composition), Intellectual Disciplines, Writing Evaluation, Writing Tests
Chang, Mei-Lin; Engelhard, George, Jr. – Journal of Psychoeducational Assessment, 2016
The purpose of this study is to examine the psychometric quality of the Teachers' Sense of Efficacy Scale (TSES) with data collected from 554 teachers in a U.S. Midwestern state. The many-facet Rasch model was used to examine several potential contextual influences (years of teaching experience, school context, and levels of emotional exhaustion)…
Descriptors: Models, Teacher Attitudes, Self Efficacy, Item Response Theory
Kaliski, Pamela; Wind, Stefanie A.; Engelhard, George, Jr.; Morgan, Deanna; Plake, Barbara; Reshetar, Rosemary – College Board, 2012
The Many-Facet Rasch (MFR) Model is traditionally used to evaluate the quality of ratings on constructed response assessments; however, it can also be used to evaluate the quality of judgments from panel-based standard setting procedures. The current study illustrates the use of the MFR Model by examining the quality of ratings obtained from a…
Descriptors: Advanced Placement Programs, Achievement Tests, Item Response Theory, Models