Publication Date
In 2025: 0
Since 2024: 10
Since 2021 (last 5 years): 36
Since 2016 (last 10 years): 87
Since 2006 (last 20 years): 172
Descriptor
Item Response Theory: 264
Test Items: 104
Models: 85
Simulation: 76
Comparative Analysis: 49
Scores: 40
Computation: 34
Computer Assisted Testing: 30
Error of Measurement: 30
Evaluation Methods: 30
Difficulty Level: 28
Source
Journal of Educational Measurement: 264
Author
Lee, Won-Chan: 10
Sinharay, Sandip: 8
van der Linden, Wim J.: 7
Cohen, Allan S.: 5
Kolen, Michael J.: 5
Wang, Wen-Chung: 5
Wilson, Mark: 5
Dorans, Neil J.: 4
Gierl, Mark J.: 4
Jiao, Hong: 4
Mislevy, Robert J.: 4
Publication Type
Journal Articles: 264
Reports - Research: 142
Reports - Evaluative: 89
Reports - Descriptive: 29
Speeches/Meeting Papers: 12
Book/Product Reviews: 4
Information Analyses: 1
Opinion Papers: 1
Education Level
Secondary Education: 13
Higher Education: 6
Middle Schools: 5
Elementary Education: 4
Elementary Secondary Education: 4
Postsecondary Education: 4
High Schools: 2
Junior High Schools: 2
Grade 7: 1
Grade 8: 1
Audience
Researchers: 1
Teachers: 1
Laws, Policies, & Programs
No Child Left Behind Act 2001: 1
Race to the Top: 1
Junhuan Wei; Qin Wang; Buyun Dai; Yan Cai; Dongbo Tu – Journal of Educational Measurement, 2024
Traditional IRT and IRTree models are not appropriate for analyzing items that simultaneously consist of a multiple-choice (MC) task and a constructed-response (CR) task within a single item. To address this issue, this study proposed an item response tree model (called IRTree-MR) to accommodate items that contain different response types at different…
Descriptors: Item Response Theory, Models, Multiple Choice Tests, Cognitive Processes
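For orientation, the tree idea behind such hybrid items can be written as sequential nodes whose path probabilities multiply; this is the generic IRTree decomposition, not necessarily the exact IRTree-MR parameterization:

```latex
% Generic IRTree factorization for an item with an MC node followed by a CR node
\Pr(Y_{\mathrm{MC}} = m,\; Y_{\mathrm{CR}} = k \mid \theta)
  = \Pr(Y_{\mathrm{MC}} = m \mid \theta)\,
    \Pr(Y_{\mathrm{CR}} = k \mid \theta,\; Y_{\mathrm{MC}} = m)
```

Each node can then carry its own IRT model (e.g., a 2PL at the MC node and a graded model at the CR node).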
Jianbin Fu; Xuan Tan; Patrick C. Kyllonen – Journal of Educational Measurement, 2024
This paper presents the item and test information functions of the Rank two-parameter logistic models (Rank-2PLM) for items with two (pair) and three (triplet) statements in forced-choice questionnaires. The Rank-2PLM for pairs is the MUPP-2PLM (Multi-Unidimensional Pairwise Preference) model and, for triplets, the Triplet-2PLM. Fisher's…
Descriptors: Questionnaires, Test Items, Item Response Theory, Models
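For reference, the standard 2PL building block behind such information functions: with $P_j(\theta) = [1 + e^{-a_j(\theta - b_j)}]^{-1}$, the Fisher item and test information are

```latex
I_j(\theta) = a_j^2\, P_j(\theta)\bigl[1 - P_j(\theta)\bigr],
\qquad
I(\theta) = \sum_{j} I_j(\theta)
```

The Rank-2PLM versions presumably apply the same Fisher-information machinery to the pair and triplet preference probabilities rather than to single statements.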
Sooyong Lee; Suhwa Han; Seung W. Choi – Journal of Educational Measurement, 2024
Research has shown that multiple-indicator multiple-cause (MIMIC) models can result in inflated Type I error rates in detecting differential item functioning (DIF) when the assumption of equal latent variance is violated. This study explains how the violation of the equal variance assumption adversely impacts the detection of nonuniform DIF and…
Descriptors: Factor Analysis, Bayesian Statistics, Test Bias, Item Response Theory
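For orientation, the usual MIMIC DIF setup (generic notation, not the authors' exact specification) regresses the latent trait on the grouping covariate and tests a direct group effect on the item:

```latex
% eta: latent trait; z: group indicator; y*_j: latent response underlying item j
\eta = \gamma z + \zeta, \qquad
y^{*}_{j} = \lambda_{j}\,\eta + \beta_{j} z + \varepsilon_{j}
```

A nonzero β_j indicates uniform DIF; nonuniform DIF further requires group differences in λ_j, and the assumption at issue is that Var(ζ) is equal across groups.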
Tao Gong; Lan Shuai; Robert J. Mislevy – Journal of Educational Measurement, 2024
The usual interpretation of the person and task variables in between-persons measurement models such as item response theory (IRT) is as attributes of persons and tasks, respectively. They can be viewed instead as ensemble descriptors of patterns of interactions among persons and situations that arise from sociocognitive complex adaptive system…
Descriptors: Cognitive Processes, Item Response Theory, Social Cognition, Individualized Instruction
Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024
This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…
Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction
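A minimal two-group sketch in the spirit of the residual framework, assuming RDIF-type statistics contrast mean raw and squared residuals u − P(θ̂) between groups; the function names and setup are illustrative, not the GRDIF implementation:

```python
import numpy as np

def p2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def rdif_stats(u, theta, a, b, group):
    """Residual-based DIF statistics for one studied item, two groups.

    u     : 0/1 responses to the item
    theta : ability estimates from the whole test
    a, b  : item parameters calibrated on the combined sample
    group : 0 = reference, 1 = focal (NumPy array)
    """
    resid = u - p2pl(theta, a, b)                     # raw residuals
    r_ref, r_foc = resid[group == 0], resid[group == 1]
    rdif_r = r_foc.mean() - r_ref.mean()              # location: uniform DIF
    rdif_s = (r_foc**2).mean() - (r_ref**2).mean()    # spread: nonuniform DIF
    return rdif_r, rdif_s
```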
Yamaguchi, Kazuhiro; Zhang, Jihong – Journal of Educational Measurement, 2023
This study proposed Gibbs sampling algorithms for variable selection in a latent regression model under a unidimensional two-parameter logistic item response theory model. Three types of shrinkage priors were employed to obtain shrinkage estimates: double-exponential (i.e., Laplace), horseshoe, and horseshoe+ priors. These shrinkage priors were…
Descriptors: Algorithms, Simulation, Mathematics Achievement, Bayesian Statistics
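The three shrinkage priors named here have standard forms; for a latent-regression coefficient β_p (the paper's hyperparameter choices may differ):

```latex
\text{Laplace:}\quad \pi(\beta_p \mid b) = \tfrac{1}{2b}\, e^{-|\beta_p|/b}
\text{Horseshoe:}\quad \beta_p \mid \lambda_p, \tau \sim N(0,\, \lambda_p^2 \tau^2),
  \qquad \lambda_p \sim C^{+}(0, 1)
\text{Horseshoe+:}\quad \lambda_p \mid \eta_p \sim C^{+}(0, \eta_p),
  \qquad \eta_p \sim C^{+}(0, 1)
```

The heavier-tailed horseshoe and horseshoe+ priors shrink small coefficients hard while leaving large ones nearly untouched, which is what makes them attractive for variable selection.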
Chalmers, R. Philip – Journal of Educational Measurement, 2023
Several marginal effect size (ES) statistics suitable for quantifying the magnitude of differential item functioning (DIF) have been proposed in the area of item response theory; for instance, the Differential Functioning of Items and Tests (DFIT) statistics, signed and unsigned item difference in the sample statistics (SIDS, UIDS, NSIDS, and…
Descriptors: Test Bias, Item Response Theory, Definitions, Monte Carlo Methods
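For orientation, the signed index is typically an average difference in model-implied item scores for the focal sample under the two groups' item parameters (the NSIDS and UIDS variants are defined in the paper):

```latex
\mathrm{SIDS}_j = \frac{1}{N_F} \sum_{i \in F}
  \Bigl[ E\bigl(Y_{ij} \mid \theta_i,\, \xi_j^{F}\bigr)
       - E\bigl(Y_{ij} \mid \theta_i,\, \xi_j^{R}\bigr) \Bigr]
```

The unsigned counterpart takes absolute differences so that positive and negative gaps across the θ range cannot cancel.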
Choe, Edison M.; Han, Kyung T. – Journal of Educational Measurement, 2022
In operational testing, item response theory (IRT) models for dichotomous responses are popular for measuring a single latent construct θ, such as cognitive ability in a content domain. Estimates of θ, also called IRT scores or θ̂, can be computed using estimators based on the likelihood function, such as maximum likelihood…
Descriptors: Scores, Item Response Theory, Test Items, Test Format
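A textbook Newton-Raphson sketch of the maximum-likelihood θ̂ under the 2PL (the standard estimator, not code from this paper):

```python
import numpy as np

def mle_theta(u, a, b, theta0=0.0, iters=20):
    """Newton-Raphson MLE of theta for dichotomous 2PL responses.

    u : 0/1 response vector; a, b : discrimination / difficulty arrays.
    No finite MLE exists for all-correct or all-incorrect patterns.
    """
    theta = theta0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
        grad = np.sum(a * (u - p))           # score (first derivative of log-lik)
        info = np.sum(a**2 * p * (1 - p))    # information (2PL: observed = expected)
        theta += grad / info                 # Newton step
    return theta
```

The divergence for perfect and zero scores is one reason weighted-likelihood and Bayesian (EAP/MAP) alternatives are popular.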
Güler Yavuz Temel – Journal of Educational Measurement, 2024
The purpose of this study was to investigate multidimensional DIF with simple and nonsimple structure in the context of the multidimensional Graded Response Model (MGRM). This study examined and compared the performance of the IRT-LR and Wald tests using MML-EM and MHRM estimation approaches with different test factors and test structures in…
Descriptors: Computation, Multidimensional Scaling, Item Response Theory, Models
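For reference, the MGRM models cumulative category probabilities with a slope vector (generic notation; simple structure restricts each item to a single dimension):

```latex
\Pr(Y_{ij} \ge k \mid \boldsymbol{\theta}_i)
  = \frac{1}{1 + \exp\!\bigl[-(\mathbf{a}_j^{\top}\boldsymbol{\theta}_i - b_{jk})\bigr]},
\qquad
\Pr(Y_{ij} = k) = \Pr(Y_{ij} \ge k) - \Pr(Y_{ij} \ge k+1)
```

Under simple structure each a_j has one nonzero entry; a nonsimple structure allows cross-loadings.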
Park, Seohee; Kim, Kyung Yong; Lee, Won-Chan – Journal of Educational Measurement, 2023
Multiple measures, such as multiple content domains or multiple types of performance, are used in various testing programs to classify examinees for screening or selection. Despite the widespread use of multiple measures, there is little research on their classification consistency and accuracy. Accordingly, this study introduces an…
Descriptors: Testing, Computation, Classification, Accuracy
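A toy Monte Carlo sketch of classification consistency at a single cut score, treating two parallel administrations as true score plus independent error; the error model and names are illustrative, and the paper's IRT-based procedure is more involved:

```python
import numpy as np

rng = np.random.default_rng(1)

def consistency(theta, cut, se, reps=100_000):
    """Agreement rate of pass/fail decisions across two parallel
    administrations, each scored as theta + independent N(0, se) error."""
    i = rng.integers(len(theta), size=reps)     # sample examinees
    s1 = theta[i] + rng.normal(0.0, se, reps)   # administration 1
    s2 = theta[i] + rng.normal(0.0, se, reps)   # administration 2
    return np.mean((s1 >= cut) == (s2 >= cut))

# e.g., consistency(rng.normal(size=5000), cut=0.5, se=0.3)
```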
Kim, Stella Y.; Lee, Won-Chan – Journal of Educational Measurement, 2023
The current study proposed several variants of simple-structure multidimensional item response theory equating procedures. Four distinct sets of data were used to demonstrate the feasibility of the proposed equating methods for two different equating designs: a random groups design and a common-item nonequivalent groups design. Findings indicated some…
Descriptors: Item Response Theory, Equated Scores, Monte Carlo Methods, Research Methodology
Strachan, Tyler; Cho, Uk Hyun; Kim, Kyung Yong; Willse, John T.; Chen, Shyh-Huei; Ip, Edward H.; Ackerman, Terry A.; Weeks, Jonathan P. – Journal of Educational Measurement, 2021
In vertical scaling, results of tests from several different grade levels are placed on a common scale. Most vertical scaling methodologies rely heavily on the assumption that the construct being measured is unidimensional. In many testing situations, however, such an assumption could be problematic. For instance, the construct measured at one…
Descriptors: Item Response Theory, Scaling, Tests, Construct Validity
Combs, Adam – Journal of Educational Measurement, 2023
A common method of checking person fit in Bayesian item response theory (IRT) is the posterior-predictive (PP) method. In recent years, more powerful approaches have been proposed that are based on resampling methods using the popular l*_z statistic. A new Bayesian model checking method has also been proposed based on pivotal…
Descriptors: Bayesian Statistics, Goodness of Fit, Evaluation Methods, Monte Carlo Methods
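For orientation, the resampling approaches build on the standardized log-likelihood person-fit statistic l_z, of which l*_z is the correction for using an estimated θ̂:

```latex
\ell(\theta) = \sum_j \bigl[u_j \ln P_j(\theta) + (1-u_j)\ln(1-P_j(\theta))\bigr],
\qquad
l_z = \frac{\ell(\theta) - E[\ell(\theta)]}{\sqrt{\operatorname{Var}[\ell(\theta)]}}
E[\ell] = \sum_j \bigl[P_j \ln P_j + (1-P_j)\ln(1-P_j)\bigr],
\qquad
\operatorname{Var}[\ell] = \sum_j P_j(1-P_j)\left[\ln\frac{P_j}{1-P_j}\right]^{2}
```

Large negative values of l_z indicate response patterns less likely than the model expects for that person.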
He, Yinhong – Journal of Educational Measurement, 2023
Back random responding (BRR) behavior is one of the commonly observed careless response behaviors. Accurately detecting BRR behavior can improve test validity. Yu and Cheng (2019) showed that the change point analysis (CPA) procedure based on weighted residuals (CPA-WR) performed well in detecting BRR. Compared with the CPA procedure, the…
Descriptors: Test Validity, Item Response Theory, Measurement, Monte Carlo Methods
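A schematic change-point scan for intuition; CPA-WR specifically uses weighted residuals u_j − P_j(θ̂) and its own reference distribution:

```python
import numpy as np

def cpa_scan(resid):
    """Schematic change-point scan over one person's per-item residuals.

    Standardizes the gap between mean residuals before and after each
    candidate change point and returns the maximum; a large value
    suggests a mid-test behavior change such as back random responding.
    """
    resid = np.asarray(resid, dtype=float)
    n, best = len(resid), 0.0
    for t in range(2, n - 1):                     # candidate change points
        gap = resid[:t].mean() - resid[t:].mean()
        se = resid.std(ddof=1) * np.sqrt(1/t + 1/(n - t))
        best = max(best, abs(gap) / se)
    return best
```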
van der Linden, Wim J.; Belov, Dmitry I. – Journal of Educational Measurement, 2023
A test of item compromise is presented which combines the test takers' responses and response times (RTs) into a statistic defined as the number of correct responses on the item for test takers with RTs flagged as suspicious. The test has null and alternative distributions belonging to the well-known family of compound binomial distributions, is…
Descriptors: Item Response Theory, Reaction Time, Test Items, Item Analysis
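Under one reading of the statistic, the null computation is exact: among the test takers whose RTs are flagged, the number of correct responses is a sum of independent Bernoulli trials with model-implied success probabilities, i.e., compound (Poisson-) binomial. A sketch, with the flagging rule and probabilities as placeholders:

```python
import numpy as np

def poisson_binomial_pmf(probs):
    """Exact PMF of a sum of independent Bernoulli(p_i), by convolution."""
    pmf = np.array([1.0])
    for p in probs:
        pmf = np.convolve(pmf, [1.0 - p, p])
    return pmf

def compromise_pvalue(k_correct, probs):
    """Upper-tail p-value P(K >= k_correct) for the number of correct
    responses among RT-flagged test takers; `probs` stand in for the
    model-implied probabilities of a correct response."""
    pmf = poisson_binomial_pmf(probs)
    return float(pmf[k_correct:].sum())
```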