ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	5
Since 2006 (last 20 years)	13

Descriptor

Simulation	13
Item Response Theory	12
Statistical Analysis	6
Computation	5
Goodness of Fit	5
Correlation	4
Comparative Analysis	3
Computer Assisted Testing	3
Models	3
Regression (Statistics)	3
Scores	3
Test Items	3
Testing Problems	3
Achievement Tests	2
Difficulty Level	2
Equations (Mathematics)	2
Error Patterns	2
Hypothesis Testing	2
Methods	2
Probability	2
Sampling	2
Test Format	2
Tests	2
Academic Achievement	1
Adaptive Testing	1
More ▼

Source

Journal of Educational…	4
Journal of Educational and…	3
ETS Research Report Series	2
Applied Measurement in…	1
Applied Psychological…	1
Educational Measurement:…	1
Grantee Submission	1

Author

Sinharay, Sandip	13
Choi, Seung W.	2
Kim, Dong-In	2
Wan, Ping	2
Holland, Paul	1
Johnson, Matthew S.	1
Levy, Roy	1
Lu, Ying	1
Mislevy, Robert J.	1
Steinhauer, Eric W.	1
Sweeney, Sandra M.	1
Whitaker, Mike	1
Zhang, Litong	1
von Davier, Matthias	1
More ▼

Publication Type

Journal Articles	12
Reports - Research	10
Reports - Evaluative	2
Reports - Descriptive	1

Education Level

Audience

Location

Laws, Policies, & Programs

Assessments and Surveys

Indiana Statewide Testing for…

What Works Clearinghouse Rating

Showing all 13 results Save | Export

An Investigation of the Nature and Consequence of the Relationship between IRT Difficulty and Discrimination

Peer reviewed

Direct link

Sweeney, Sandra M.; Sinharay, Sandip; Johnson, Matthew S.; Steinhauer, Eric W. – Educational Measurement: Issues and Practice, 2022

The focus of this paper is on the empirical relationship between item difficulty and item discrimination. Two studies--an empirical investigation and a simulation study--were conducted to examine the association between item difficulty and item discrimination under classical test theory and item response theory (IRT), and the effects of the…

Descriptors: Correlation, Item Response Theory, Item Analysis, Difficulty Level

Extension of Caution Indices to Mixed-Format Tests

Peer reviewed
PDF on ERIC

Download full text

Direct link

Sinharay, Sandip – Grantee Submission, 2018

Tatsuoka (1984) suggested several extended caution indices and their standardized versions that have been used as person-fit statistics by researchers such as Drasgow, Levine, and McLaughlin (1987), Glas and Meijer (2003), and Molenaar and Hoijtink (1990). However, these indices are only defined for tests with dichotomous items. This paper extends…

Descriptors: Test Format, Goodness of Fit, Item Response Theory, Error Patterns

Assessment of Person Fit Using Resampling-Based Approaches

Peer reviewed

Direct link

Sinharay, Sandip – Journal of Educational Measurement, 2016

De la Torre and Deng suggested a resampling-based approach for person-fit assessment (PFA). The approach involves the use of the [math equation unavailable] statistic, a corrected expected a posteriori estimate of the examinee ability, and the Monte Carlo (MC) resampling method. The Type I error rate of the approach was closer to the nominal level…

Descriptors: Sampling, Research Methodology, Error Patterns, Monte Carlo Methods

Are the Nonparametric Person-Fit Statistics More Powerful than Their Parametric Counterparts? Revisiting the Simulations in Karabatsos (2003)

Peer reviewed

Direct link

Sinharay, Sandip – Applied Measurement in Education, 2017

Karabatsos compared the power of 36 person-fit statistics using receiver operating characteristics curves and found the "H[superscript T]" statistic to be the most powerful in identifying aberrant examinees. He found three statistics, "C", "MCI", and "U3", to be the next most powerful. These four statistics,…

Descriptors: Nonparametric Statistics, Goodness of Fit, Simulation, Comparative Analysis

Person Fit Analysis in Computerized Adaptive Testing Using Tests for a Change Point

Peer reviewed

Direct link

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2016

Meijer and van Krimpen-Stoop noted that the number of person-fit statistics (PFSs) that have been designed for computerized adaptive tests (CATs) is relatively modest. This article partially addresses that concern by suggesting three new PFSs for CATs. The statistics are based on tests for a change point and can be used to detect an abrupt change…

Descriptors: Computer Assisted Testing, Adaptive Testing, Item Response Theory, Goodness of Fit

Assessing Individual-Level Impact of Interruptions during Online Testing

Peer reviewed

Direct link

Sinharay, Sandip; Wan, Ping; Choi, Seung W.; Kim, Dong-In – Journal of Educational Measurement, 2015

With an increase in the number of online tests, the number of interruptions during testing due to unexpected technical issues seems to be on the rise. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. Researchers such as…

Descriptors: Computer Assisted Testing, Testing Problems, Scores, Statistical Analysis

Determining the Overall Impact of Interruptions during Online Testing

Peer reviewed

Direct link

Sinharay, Sandip; Wan, Ping; Whitaker, Mike; Kim, Dong-In; Zhang, Litong; Choi, Seung W. – Journal of Educational Measurement, 2014

With an increase in the number of online tests, interruptions during testing due to unexpected technical issues seem unavoidable. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. There is a lack of research on this…

Descriptors: Computer Assisted Testing, Testing Problems, Scores, Regression (Statistics)

Assessment of Person Fit for Mixed-Format Tests

Peer reviewed

Direct link

Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2015

Person-fit assessment may help the researcher to obtain additional information regarding the answering behavior of persons. Although several researchers examined person fit, there is a lack of research on person-fit assessment for mixed-format tests. In this article, the lz statistic and the ?2 statistic, both of which have been used for tests…

Descriptors: Test Format, Goodness of Fit, Item Response Theory, Bayesian Statistics

How Often Do Subscores Have Added Value? Results from Operational and Simulated Data

Peer reviewed

Direct link

Sinharay, Sandip – Journal of Educational Measurement, 2010

Recently, there has been an increasing level of interest in subscores for their potential diagnostic value. Haberman suggested a method based on classical test theory to determine whether subscores have added value over total scores. In this article I first provide a rich collection of results regarding when subscores were found to have added…

Descriptors: Scores, Test Theory, Simulation, Reliability

Posterior Predictive Model Checking for Multidimensionality in Item Response Theory

Peer reviewed

Direct link

Levy, Roy; Mislevy, Robert J.; Sinharay, Sandip – Applied Psychological Measurement, 2009

If data exhibit multidimensionality, key conditional independence assumptions of unidimensional models do not hold. The current work pursues posterior predictive model checking, a flexible family of model-checking procedures, as a tool for criticizing models due to unaccounted for dimensions in the context of item response theory. Factors…

Descriptors: Item Response Theory, Models, Methods, Simulation

The Correlation between Item Parameters and Item Fit Statistics. Research Report. ETS RR-07-36

Peer reviewed
PDF on ERIC

Download full text

Sinharay, Sandip; Lu, Ying – ETS Research Report Series, 2007

Dodeen (2004) studied the correlation between the item parameters of the three-parameter logistic model and two item fit statistics, and found some linear relationships (e.g., a positive correlation between item discrimination parameters and item fit statistics) that have the potential for influencing the work of practitioners who employ item…

Descriptors: Correlation, Test Items, Item Response Theory, Goodness of Fit

An Importance Sampling EM Algorithm for Latent Regression Models

Peer reviewed

Direct link

von Davier, Matthias; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2007

Reporting methods used in large-scale assessments such as the National Assessment of Educational Progress (NAEP) rely on latent regression models. To fit the latent regression model using the maximum likelihood estimation technique, multivariate integrals must be evaluated. In the computer program MGROUP used by the Educational Testing Service for…

Descriptors: Simulation, Computer Software, Sampling, Data Analysis

The Correlation between the Scores of a Test and an Anchor Test. Research Report. ETS RR-06-04

Peer reviewed
PDF on ERIC

Download full text

Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006

It is a widely held belief that an anchor test used in equating should be a miniature version (or "minitest") of the tests to be equated; that is, the anchor test should be proportionally representative of the two tests in content and statistical characteristics. This paper examines the scientific foundation of this belief, especially…

Descriptors: Test Items, Equated Scores, Correlation, Tests