Publication Date
In 2025 | 0 |
Since 2024 | 5 |
Since 2021 (last 5 years) | 19 |
Since 2016 (last 10 years) | 42 |
Since 2006 (last 20 years) | 64 |
Descriptor
Evaluation Methods | 64 |
Measurement | 14 |
Educational Assessment | 13 |
Student Evaluation | 13 |
Cutting Scores | 9 |
Item Response Theory | 9 |
Test Construction | 9 |
Test Items | 9 |
Evaluation Criteria | 8 |
Models | 8 |
Validity | 8 |
Source
Educational Measurement:… | 64 |
Author
Wyse, Adam E. | 4 |
Sireci, Stephen G. | 3 |
Wind, Stefanie A. | 3 |
Babcock, Ben | 2 |
Baldwin, Peter | 2 |
Herman, Joan | 2 |
Penfield, Randall D. | 2 |
Reckase, Mark D. | 2 |
Alina A. von Davier | 1 |
Ames, Allison J. | 1 |
Amy K. Clark | 1 |
Publication Type
Journal Articles | 64 |
Reports - Descriptive | 23 |
Reports - Research | 22 |
Reports - Evaluative | 17 |
Information Analyses | 2 |
Opinion Papers | 2 |
Speeches/Meeting Papers | 2 |
Location
California | 1 |
China | 1 |
Florida | 1 |
Idaho | 1 |
New Hampshire | 1 |
USSR | 1 |
Washington | 1 |
Wisconsin | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 1 |
Race to the Top | 1 |
Assessments and Surveys
Program for International… | 1 |
Progress in International… | 1 |
Trends in International… | 1 |
Buzick, Heather M.; Casabianca, Jodi M.; Gholson, Melissa L. – Educational Measurement: Issues and Practice, 2023
The article describes practical suggestions for measurement researchers and psychometricians to respond to calls for social responsibility in assessment. The underlying assumption is that personalizing large-scale assessment improves the chances that assessment and the use of test scores will contribute to equity in education. This article…
Descriptors: Achievement Tests, Individualized Instruction, Evaluation Methods, Equal Education
W. Jake Thompson; Amy K. Clark – Educational Measurement: Issues and Practice, 2024
In recent years, educators, administrators, policymakers, and measurement experts have called for assessments that support educators in making better instructional decisions. One promising approach to measurement to support instructional decision-making is diagnostic classification models (DCMs). DCMs are flexible psychometric models that…
Descriptors: Decision Making, Instructional Improvement, Evaluation Methods, Models
Kim, Stella Y. – Educational Measurement: Issues and Practice, 2022
In this digital ITEMS module, Dr. Stella Kim provides an overview of multidimensional item response theory (MIRT) equating. Traditional unidimensional item response theory (IRT) equating methods impose the sometimes untenable restriction on data that only a single ability is assessed. This module discusses potential sources of multidimensionality…
Descriptors: Item Response Theory, Models, Equated Scores, Evaluation Methods
Jiangang Hao; Alina A. von Davier; Victoria Yaneva; Susan Lottridge; Matthias von Davier; Deborah J. Harris – Educational Measurement: Issues and Practice, 2024
The remarkable strides in artificial intelligence (AI), exemplified by ChatGPT, have unveiled a wealth of opportunities and challenges in assessment. Applying cutting-edge large language models (LLMs) and generative AI to assessment holds great promise in boosting efficiency, mitigating bias, and facilitating customized evaluations. Conversely,…
Descriptors: Evaluation Methods, Artificial Intelligence, Educational Change, Computer Software
Bennett, Randy E. – Educational Measurement: Issues and Practice, 2022
This commentary focuses on one of the positive impacts of COVID-19, which was to tie societal inequity to testing in a manner that could motivate the reimagining of our field. That reimagining needs to account for our nation's dramatically changing demographics so that assessment generally, and standardized testing specifically, better fit the…
Descriptors: COVID-19, Pandemics, Social Justice, Testing
Peabody, Michael R.; Muckle, Timothy J.; Meng, Yu – Educational Measurement: Issues and Practice, 2023
The subjective aspect of standard-setting is often criticized, yet data-driven standard-setting methods are rarely applied. Therefore, we applied a mixture Rasch model approach to setting performance standards across several testing programs of various sizes and compared the results to existing passing standards derived from traditional…
Descriptors: Item Response Theory, Standard Setting, Testing, Sampling
Lewis, Jennifer; Lim, Hwanggyu; Padellaro, Frank; Sireci, Stephen G.; Zenisky, April L. – Educational Measurement: Issues and Practice, 2022
Setting cut scores on multistage tests (MSTs) is difficult, particularly when the test spans several grade levels, and the selection of items from MST panels must reflect the operational test specifications. In this study, we describe, illustrate, and evaluate three methods for mapping panelists' Angoff ratings into cut scores on the scale underlying an MST. The…
Descriptors: Cutting Scores, Adaptive Testing, Test Items, Item Analysis
Baldwin, Peter – Educational Measurement: Issues and Practice, 2021
In the Bookmark standard-setting procedure, panelists are instructed to consider what examinees know rather than what they might attain by guessing; however, because examinees sometimes do guess, the procedure includes a correction for guessing. Like other corrections for guessing, the Bookmark's correction assumes that examinees either know the…
Descriptors: Guessing (Tests), Student Evaluation, Evaluation Methods, Standard Setting (Scoring)
Tong, Ye – Educational Measurement: Issues and Practice, 2022
COVID-19 is disrupting assessment practices and accelerating changes. With special focus on K-12 and credentialing exams, this article describes the series of changes observed during the pandemic, the solutions assessment providers have implemented, and the long-term impact on future practices. Additionally, this article highlights the importance…
Descriptors: COVID-19, Pandemics, Elementary Secondary Education, Evaluation Methods
Middleton, Kyndra V. – Educational Measurement: Issues and Practice, 2022
The onset of the coronavirus pandemic forced schools and universities across the nation and world to close and move to distance learning rather immediately. Almost two years later, colleges and universities have reopened, and most students have returned to campuses, but distance learning still occurs at a much higher rate than before the beginning…
Descriptors: Computer Assisted Testing, Internet, Student Evaluation, College Students
Ji, Xuejun Ryan; Wu, Amery D. – Educational Measurement: Issues and Practice, 2023
The Cross-Classified Mixed Effects Model (CCMEM) has been demonstrated to be a flexible framework for evaluating reliability by measurement specialists. Reliability can be estimated based on the variance components of the test scores. Built upon their accomplishment, this study extends the CCMEM to be used for evaluating validity evidence.…
Descriptors: Measurement, Validity, Reliability, Models
Baron, Patricia; Sireci, Stephen G.; Slater, Sharon C. – Educational Measurement: Issues and Practice, 2021
Since the No Child Left Behind Act (No Child Left Behind [NCLB], 2001) was enacted, the Bookmark method has been used in many state standard setting studies (Karantonis and Sireci; Zieky, Perie, and Livingston). The purpose of the current study is to evaluate the criticism that when panelists are presented with data during the Bookmark standard…
Descriptors: State Standards, Standard Setting, Evaluators, Training
Johnson, Evelyn S.; Crawford, Angela R.; Zheng, Yuzhu; Moylan, Laura A. – Educational Measurement: Issues and Practice, 2021
In this study, we compared the results of 27 special education teachers' evaluations using two different observation instruments, the Framework for Teaching (FFT), and the Explicit Instruction observation protocol of the Recognizing Effective Special Education Teachers (RESET) observation system. Results indicate differences in the rank-ordering…
Descriptors: Special Education Teachers, Teacher Evaluation, Teacher Effectiveness, Evaluation Methods
Baldwin, Peter; Margolis, Melissa J.; Clauser, Brian E.; Mee, Janet; Winward, Marcia – Educational Measurement: Issues and Practice, 2020
Evidence of the internal consistency of standard-setting judgments is a critical part of the validity argument for tests used to make classification decisions. The bookmark standard-setting procedure is a popular approach to establishing performance standards, but there is relatively little research that reflects on the internal consistency of the…
Descriptors: Standard Setting (Scoring), Probability, Cutting Scores, Evaluation Methods
Wyse, Adam E. – Educational Measurement: Issues and Practice, 2020
One commonly used compromise standard-setting method is the Beuk (1984) method. A key assumption of the Beuk method is that the emphasis given to the pass rate and the percent correct ratings should be proportional to the extent that the panelists agree on their ratings. However, whether the slope of the Beuk line reflects the emphasis that panelists…
Descriptors: Standard Setting (Scoring), Cutting Scores, Weighted Scores, Evaluation Methods