Sinharay, Sandip; Haberman, Shelby J.; Lee, Yi-Hsuan – Journal of Educational Measurement, 2011
Providing information to test takers and test score users about the abilities of test takers at different score levels has been a persistent problem in educational and psychological measurement. Scale anchoring, a technique which describes what students at different points on a score scale know and can do, is a tool to provide such information.…
Descriptors: Scores, Test Items, Statistical Analysis, Licensing Examinations (Professions)
Sinharay, Sandip; Dorans, Neil J. – Journal of Educational and Behavioral Statistics, 2010
The Mantel-Haenszel (MH) procedure (Mantel & Haenszel, 1959) is a popular method for estimating and testing a common two-factor association parameter in a 2 x 2 x K table. Holland (1985) and Holland and Thayer (1988) described how to use the procedure to detect differential item functioning (DIF) for tests with dichotomously scored items. Wang, Bradlow, Wainer, and…
Descriptors: Test Bias, Statistical Analysis, Computation, Bayesian Statistics
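As an illustration of the 2 x 2 x K setup mentioned in the abstract above, the following is a minimal sketch of the textbook Mantel-Haenszel common odds ratio and its continuity-corrected chi-square statistic. It is not the Bayesian extension studied in the article, and the function names and the (a, b, c, d) table layout are assumptions made here for the example.

```python
from typing import Sequence, Tuple

# One stratum (e.g. one matching-score level) is a 2 x 2 table (a, b, c, d):
#   a = reference group, item correct    b = reference group, item incorrect
#   c = focal group, item correct        d = focal group, item incorrect
# This layout is an assumption of the sketch, not taken from the article.
Table = Tuple[float, float, float, float]


def mh_common_odds_ratio(tables: Sequence[Table]) -> float:
    """Mantel-Haenszel estimate of the odds ratio shared by all K strata."""
    num = 0.0
    den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ValueError("odds ratio undefined: every b*c product is zero")
    return num / den


def mh_chi_square(tables: Sequence[Table]) -> float:
    """Mantel-Haenszel chi-square (1 df) with the 0.5 continuity correction."""
    sum_a = 0.0  # observed count in cell a, summed over strata
    sum_e = 0.0  # expected count in cell a under no association
    sum_v = 0.0  # hypergeometric variance of cell a
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue  # the variance term needs n - 1 > 0
        n_ref, n_foc = a + b, c + d
        n_right, n_wrong = a + c, b + d
        sum_a += a
        sum_e += n_ref * n_right / n
        sum_v += n_ref * n_foc * n_right * n_wrong / (n * n * (n - 1))
    if sum_v == 0:
        raise ValueError("chi-square undefined: zero variance in all strata")
    return (abs(sum_a - sum_e) - 0.5) ** 2 / sum_v
```

For example, `mh_common_odds_ratio([(10, 5, 5, 10), (20, 10, 10, 20)])` pools two strata that each have an odds ratio of 4. An estimate far from 1 suggests that the two groups perform differently on the item after matching on the stratifying variable.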
Haberman, Shelby J.; Sinharay, Sandip; Lee, Yi-Hsuan – Educational Testing Service, 2011
Providing information to test takers and test score users about the abilities of test takers at different score levels has been a persistent problem in educational and psychological measurement (Carroll, 1993). Scale anchoring (Beaton & Allen, 1992), a technique that describes what students at different points on a score scale know and can do,…
Descriptors: Statistical Analysis, Scores, Regression (Statistics), Item Response Theory
Haberman, Shelby J.; Sinharay, Sandip – Psychometrika, 2010
Recently, there has been increasing interest in reporting subscores. This paper examines reporting of subscores using multidimensional item response theory (MIRT) models (e.g., Reckase in "Appl. Psychol. Meas." 21:25-36, 1997; C.R. Rao and S. Sinharay (Eds), "Handbook of Statistics, vol. 26," pp. 607-642, North-Holland, Amsterdam, 2007; Beguin &…
Descriptors: Item Response Theory, Psychometrics, Statistical Analysis, Scores
Sinharay, Sandip; Haberman, Shelby J.; Jia, Helena – Educational Testing Service, 2011
Standard 3.9 of the "Standards for Educational and Psychological Testing" (American Educational Research Association, American Psychological Association, & National Council for Measurement in Education, 1999) demands evidence of model fit when an item response theory (IRT) model is used to make inferences from a data set. We applied two recently…
Descriptors: Item Response Theory, Goodness of Fit, Statistical Analysis, Language Tests
Levy, Roy; Mislevy, Robert J.; Sinharay, Sandip – Applied Psychological Measurement, 2009
If data exhibit multidimensionality, key conditional independence assumptions of unidimensional models do not hold. The current work pursues posterior predictive model checking, a flexible family of model-checking procedures, as a tool for criticizing models due to unaccounted-for dimensions in the context of item response theory. Factors…
Descriptors: Item Response Theory, Models, Methods, Simulation
Sinharay, Sandip; Guo, Zhumei; von Davier, Matthias; Veldkamp, Bernard P. – ETS Research Report Series, 2009
The reporting methods used in large-scale educational assessments such as the National Assessment of Educational Progress (NAEP) rely on a "latent regression model". There is a lack of research on the assessment of fit of latent regression models. This paper suggests a simulation-based model-fit technique to assess the fit of such…
Descriptors: Regression (Statistics), Models, Goodness of Fit, National Competency Tests
von Davier, Matthias; Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2010
This article presents an application of a stochastic approximation expectation maximization (EM) algorithm using a Metropolis-Hastings (MH) sampler to estimate the parameters of an item response latent regression model. Latent regression item response models are extensions of item response theory (IRT) to a latent variable model with covariates…
Descriptors: Item Response Theory, Statistical Analysis, Regression (Statistics), Models
Liu, Jinghua; Sinharay, Sandip; Holland, Paul W.; Feigenbaum, Miriam; Curley, Edward – Educational Testing Service, 2009
This study explores the use of a different type of anchor, a "midi anchor", that has a smaller spread of item difficulties than the tests to be equated, and then contrasts its use with the use of a "mini anchor". The impact of different anchors on observed score equating was evaluated and compared with respect to systematic…
Descriptors: Equated Scores, Test Items, Difficulty Level, Error of Measurement
Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – ETS Research Report Series, 2008
Will reporting subscores provide any information beyond that in the total score? Is there a method that can be used to provide more trustworthy subscores than observed subscores? These 2 questions are addressed in this study. To answer the 2nd question, 2 subscore estimation methods (i.e., subscore estimated from the observed total score or…
Descriptors: Comparative Analysis, Scores, Tests, Certification
Sinharay, Sandip; Lu, Ying – ETS Research Report Series, 2007
Dodeen (2004) studied the correlation between the item parameters of the three-parameter logistic model and two item fit statistics, and found some linear relationships (e.g., a positive correlation between item discrimination parameters and item fit statistics) that have the potential for influencing the work of practitioners who employ item…
Descriptors: Correlation, Test Items, Item Response Theory, Goodness of Fit
Sinharay, Sandip; Haberman, Shelby; Puhan, Gautam – Educational Measurement: Issues and Practice, 2007
There is an increasing interest in reporting subscores, both at examinee level and at aggregate levels. However, it is important to ensure reasonable subscore performance in terms of high reliability and validity to minimize incorrect instructional and remediation decisions. This article employs a statistical measure based on classical test theory…
Descriptors: Test Reliability, Test Theory, Test Validity, Statistical Analysis
Sinharay, Sandip; Powers, Donald E.; Feng, Ying; Saldivia, Luis; Giunta, Anthony; Simpson, Annabelle; Weng, Vincent – Language Testing, 2009
In order to facilitate the interpretation of test scores from the TOEIC® "Bridge" as a measure of English language proficiency, one form of the test was administered to more than 6000 test takers in three South American countries: Colombia, Chile and Ecuador. The appropriateness of the TOEIC "Bridge" test as a measure of…
Descriptors: Factor Analysis, Foreign Countries, Language Skills, English (Second Language)
Haberman, Shelby J.; Holland, Paul W.; Sinharay, Sandip – ETS Research Report Series, 2006
Bounds are established for log cross-product ratios (log odds ratios) involving pairs of items for item response models. First, expressions for bounds on log cross-product ratios are provided for unidimensional item response models in general. Then, explicit bounds are obtained for the Rasch model and the two-parameter logistic (2PL) model.…
Descriptors: Item Response Theory, Models, Goodness of Fit, Item Analysis
Sinharay, Sandip; Holland, Paul – ETS Research Report Series, 2006
It is a widely held belief that anchor tests should be miniature versions (i.e., minitests), with respect to content and statistical characteristics of the tests being equated. This paper examines the foundations for this belief. It examines the requirement of statistical representativeness of anchor tests that are content representative. The…
Descriptors: Test Items, Equated Scores, Evaluation Methods, Difficulty Level