Publication Date
In 2025 | 12 |
Since 2024 | 187 |
Since 2021 (last 5 years) | 818 |
Since 2016 (last 10 years) | 1951 |
Since 2006 (last 20 years) | 4074 |
Descriptor
Item Response Theory | 5553 |
Test Items | 1817 |
Foreign Countries | 1196 |
Models | 1148 |
Psychometrics | 918 |
Scores | 782 |
Comparative Analysis | 761 |
Test Construction | 750 |
Simulation | 740 |
Statistical Analysis | 659 |
Difficulty Level | 570 |
More ▼ |
Source
Author
Sinharay, Sandip | 48 |
Wilson, Mark | 45 |
Cohen, Allan S. | 43 |
Meijer, Rob R. | 43 |
Tindal, Gerald | 42 |
Wang, Wen-Chung | 40 |
Alonzo, Julie | 37 |
Ferrando, Pere J. | 36 |
Cai, Li | 35 |
van der Linden, Wim J. | 35 |
Glas, Cees A. W. | 34 |
More ▼ |
Publication Type
Education Level
Location
Turkey | 94 |
Australia | 89 |
Germany | 79 |
United States | 74 |
Netherlands | 68 |
Taiwan | 59 |
Indonesia | 53 |
China | 51 |
Canada | 49 |
Japan | 38 |
Florida | 37 |
More ▼ |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Meets WWC Standards without Reservations | 4 |
Meets WWC Standards with or without Reservations | 4 |
Wu, Tong; Kim, Stella Y.; Westine, Carl – Educational and Psychological Measurement, 2023
For large-scale assessments, data are often collected with missing responses. Despite the wide use of item response theory (IRT) in many testing programs, however, the existing literature offers little insight into the effectiveness of various approaches to handling missing responses in the context of scale linking. Scale linking is commonly used…
Descriptors: Data Analysis, Responses, Statistical Analysis, Measurement
Choe, Edison M.; Han, Kyung T. – Journal of Educational Measurement, 2022
In operational testing, item response theory (IRT) models for dichotomous responses are popular for measuring a single latent construct [theta], such as cognitive ability in a content domain. Estimates of [theta], also called IRT scores or [theta hat], can be computed using estimators based on the likelihood function, such as maximum likelihood…
Descriptors: Scores, Item Response Theory, Test Items, Test Format
Boris Forthmann; Benjamin Goecke; Roger E. Beaty – Creativity Research Journal, 2025
Human ratings are ubiquitous in creativity research. Yet, the process of rating responses to creativity tasks -- typically several hundred or thousands of responses, per rater -- is often time-consuming and expensive. Planned missing data designs, where raters only rate a subset of the total number of responses, have been recently proposed as one…
Descriptors: Creativity, Research, Researchers, Research Methodology
Martijn Schoenmakers; Jesper Tijmstra; Jeroen Vermunt; Maria Bolsinova – Educational and Psychological Measurement, 2024
Extreme response style (ERS), the tendency of participants to select extreme item categories regardless of the item content, has frequently been found to decrease the validity of Likert-type questionnaire results. For this reason, various item response theory (IRT) models have been proposed to model ERS and correct for it. Comparisons of these…
Descriptors: Item Response Theory, Response Style (Tests), Models, Likert Scales
Jiangqiong Li – ProQuest LLC, 2024
When measuring latent constructs, for example, language ability, we use statistical models to specify appropriate relationships between the latent construct and observe responses to test items. These models rely on theoretical assumptions to ensure accurate parameter estimates for valid inferences based on the test results. This dissertation…
Descriptors: Goodness of Fit, Item Response Theory, Models, Measurement Techniques
Güler Yavuz Temel – Journal of Educational Measurement, 2024
The purpose of this study was to investigate multidimensional DIF with a simple and nonsimple structure in the context of multidimensional Graded Response Model (MGRM). This study examined and compared the performance of the IRT-LR and Wald test using MML-EM and MHRM estimation approaches with different test factors and test structures in…
Descriptors: Computation, Multidimensional Scaling, Item Response Theory, Models
James Soland – Journal of Research on Educational Effectiveness, 2024
When randomized control trials are not possible, quasi-experimental methods often represent the gold standard. One quasi-experimental method is difference-in-difference (DiD), which compares changes in outcomes before and after treatment across groups to estimate a causal effect. DiD researchers often use fairly exhaustive robustness checks to…
Descriptors: Item Response Theory, Testing, Test Validity, Intervention
Franz Classe; Christoph Kern – Educational and Psychological Measurement, 2024
We develop a "latent variable forest" (LV Forest) algorithm for the estimation of latent variable scores with one or more latent variables. LV Forest estimates unbiased latent variable scores based on "confirmatory factor analysis" (CFA) models with ordinal and/or numerical response variables. Through parametric model…
Descriptors: Algorithms, Item Response Theory, Artificial Intelligence, Factor Analysis
Liang Ye Tan; Stuart McLean; Young Ae Kim; Joseph P. Vitta – Language Testing in Asia, 2024
This study examines how second/foreign language (L2) word difficulty estimates derived from item response theory (IRT) and classical test theory (CTT) frameworks are virtually identical in the context of vocabulary testing. This conclusion is reached via a two-stage process: (a) psychometric assessments of both approaches and (b) L2 word…
Descriptors: Vocabulary, English (Second Language), Test Validity, Second Language Learning
Nianbo Dong; Benjamin Kelcey; Jessaca Spybrook; Yanli Xie; Dung Pham; Peilin Qiu; Ning Sui – Grantee Submission, 2024
Multisite trials that randomize individuals (e.g., students) within sites (e.g., schools) or clusters (e.g., teachers/classrooms) within sites (e.g., schools) are commonly used for program evaluation because they provide opportunities to learn about treatment effects as well as their heterogeneity across sites and subgroups (defined by moderating…
Descriptors: Statistical Analysis, Randomized Controlled Trials, Educational Research, Effect Size
Jiyoun Kim; Chia-Wen Chen; Yi-Jhen Wu – Large-scale Assessments in Education, 2024
Learning strategies have been recognized as important predictors of mathematical achievement. In recent studies, it has been found that Asian students use combined learning strategies, primarily including metacognitive strategies, rather than rote memorization. To the best of the authors' knowledge, there is only one prior study including South…
Descriptors: Achievement Tests, Foreign Countries, Learning Strategies, Mathematics Achievement
Thomas W. Frazier; Andrew J. O. Whitehouse; Susan R. Leekam; Sarah J. Carrington; Gail A. Alvares; David W. Evans; Antonio Y. Hardan; Mirko Uljarevic – Journal of Autism and Developmental Disorders, 2024
Purpose: The aim of the present study was to compare scale and conditional reliability derived from item response theory analyses among the most commonly used, as well as several newly developed, observation, interview, and parent-report autism instruments. Methods: When available, data sets were combined to facilitate large sample evaluation.…
Descriptors: Test Reliability, Item Response Theory, Autism Spectrum Disorders, Clinical Diagnosis
Sijia Huang; Dubravka Svetina Valdivia – Educational and Psychological Measurement, 2024
Identifying items with differential item functioning (DIF) in an assessment is a crucial step for achieving equitable measurement. One critical issue that has not been fully addressed with existing studies is how DIF items can be detected when data are multilevel. In the present study, we introduced a Lord's Wald X[superscript 2] test-based…
Descriptors: Item Analysis, Item Response Theory, Algorithms, Accuracy
Zsuzsa Bakk – Structural Equation Modeling: A Multidisciplinary Journal, 2024
A standard assumption of latent class (LC) analysis is conditional independence, that is the items of the LC are independent of the covariates given the LCs. Several approaches have been proposed for identifying violations of this assumption. The recently proposed likelihood ratio approach is compared to residual statistics (bivariate residuals…
Descriptors: Goodness of Fit, Error of Measurement, Comparative Analysis, Models
Christine E. DeMars; Paulius Satkus – Educational and Psychological Measurement, 2024
Marginal maximum likelihood, a common estimation method for item response theory models, is not inherently a Bayesian procedure. However, due to estimation difficulties, Bayesian priors are often applied to the likelihood when estimating 3PL models, especially with small samples. Little focus has been placed on choosing the priors for marginal…
Descriptors: Item Response Theory, Statistical Distributions, Error of Measurement, Bayesian Statistics