Showing all 13 results
Peer reviewed
Sweeney, Sandra M.; Sinharay, Sandip; Johnson, Matthew S.; Steinhauer, Eric W. – Educational Measurement: Issues and Practice, 2022
The focus of this paper is on the empirical relationship between item difficulty and item discrimination. Two studies--an empirical investigation and a simulation study--were conducted to examine the association between item difficulty and item discrimination under classical test theory and item response theory (IRT), and the effects of the…
Descriptors: Correlation, Item Response Theory, Item Analysis, Difficulty Level
Sinharay, Sandip – Grantee Submission, 2018
Tatsuoka (1984) suggested several extended caution indices and their standardized versions that have been used as person-fit statistics by researchers such as Drasgow, Levine, and McLaughlin (1987), Glas and Meijer (2003), and Molenaar and Hoijtink (1990). However, these indices are only defined for tests with dichotomous items. This paper extends…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Error Patterns
Peer reviewed
Sinharay, Sandip – Journal of Educational Measurement, 2016
De la Torre and Deng suggested a resampling-based approach for person-fit assessment (PFA). The approach involves the use of the [math equation unavailable] statistic, a corrected expected a posteriori estimate of the examinee ability, and the Monte Carlo (MC) resampling method. The Type I error rate of the approach was closer to the nominal level…
Descriptors: Sampling, Research Methodology, Error Patterns, Monte Carlo Methods
Peer reviewed
Sinharay, Sandip – Applied Measurement in Education, 2017
Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristics curves and found the "H^T" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…
Descriptors: Nonparametric Statistics, Goodness of Fit, Simulation, Comparative Analysis
Peer reviewed
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2016
Meijer and van Krimpen-Stoop noted that the number of person-fit statistics (PFSs) that have been designed for computerized adaptive tests (CATs) is relatively modest. This article partially addresses that concern by suggesting three new PFSs for CATs. The statistics are based on tests for a change point and can be used to detect an abrupt change…
Descriptors: Computer Assisted Testing, Adaptive Testing, Item Response Theory, Goodness of Fit
Peer reviewed
Sinharay, Sandip; Wan, Ping; Choi, Seung W.; Kim, Dong-In – Journal of Educational Measurement, 2015
With an increase in the number of online tests, the number of interruptions during testing due to unexpected technical issues seems to be on the rise. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. Researchers such as…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Statistical Analysis
Peer reviewed
Sinharay, Sandip; Wan, Ping; Whitaker, Mike; Kim, Dong-In; Zhang, Litong; Choi, Seung W. – Journal of Educational Measurement, 2014
With an increase in the number of online tests, interruptions during testing due to unexpected technical issues seem unavoidable. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. There is a lack of research on this…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Regression (Statistics)
Peer reviewed
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2015
Person-fit assessment may help the researcher to obtain additional information regarding the answering behavior of persons. Although several researchers examined person fit, there is a lack of research on person-fit assessment for mixed-format tests. In this article, the lz statistic and the χ² statistic, both of which have been used for tests…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Bayesian Statistics
Peer reviewed
Sinharay, Sandip – Journal of Educational Measurement, 2010
Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman suggested a method based on classical test theory to determine whether subscores have added value over total scores. In this article I first provide a rich collection of results regarding when subscores were found to have added…
Descriptors: Scores, Test Theory, Simulation, Reliability
Peer reviewed
Levy, Roy; Mislevy, Robert J.; Sinharay, Sandip – Applied Psychological Measurement, 2009
If data exhibit multidimensionality, key conditional independence assumptions of unidimensional models do not hold. The current work pursues posterior predictive model checking, a flexible family of model-checking procedures, as a tool for criticizing models due to unaccounted for dimensions in the context of item response theory. Factors…
Descriptors: Item Response Theory, Models, Methods, Simulation
Peer reviewed
PDF on ERIC
Sinharay, Sandip; Lu, Ying – ETS Research Report Series, 2007
Dodeen (2004) studied the correlation between the item parameters of the three-parameter logistic model and two item fit statistics, and found some linear relationships (e.g., a positive correlation between item discrimination parameters and item fit statistics) that have the potential for influencing the work of practitioners who employ item…
Descriptors: Correlation, Test Items, Item Response Theory, Goodness of Fit
Peer reviewed
von Davier, Matthias; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2007
Reporting methods used in large-scale assessments such as the National Assessment of Educational Progress (NAEP) rely on latent regression models. To fit the latent regression model using the maximum likelihood estimation technique, multivariate integrals must be evaluated. In the computer program MGROUP used by the Educational Testing Service for…
Descriptors: Simulation, Computer Software, Sampling, Data Analysis
Peer reviewed
PDF on ERIC
Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006
It is a widely held belief that an anchor test used in equating should be a miniature version (or "minitest") of the tests to be equated; that is, the anchor test should be proportionally representative of the two tests in content and statistical characteristics. This paper examines the scientific foundation of this belief, especially…
Descriptors: Test Items, Equated Scores, Correlation, Tests