The Use of Theory of Linear Mixed-Effects Models to Detect Fraudulent Erasures at an Aggregate Level
Peng, Luyao; Sinharay, Sandip – Educational and Psychological Measurement, 2022
Wollack et al. (2015) suggested the erasure detection index (EDI) for detecting fraudulent erasures for individual examinees. Wollack and Eckerly (2017) and Sinharay (2018) extended the index of Wollack et al. (2015) to suggest three EDIs for detecting fraudulent erasures at the aggregate or group level. This article follows up on the research of…
Descriptors: Cheating, Identification, Statistical Analysis, Testing
Sinharay, Sandip; Johnson, Matthew S. – Journal of Educational and Behavioral Statistics, 2021
Score differencing is one of the six categories of statistical methods used to detect test fraud (Wollack & Schoenig, 2018) and involves the testing of the null hypothesis that the performance of an examinee is similar over two item sets versus the alternative hypothesis that the performance is better on one of the item sets. We suggest, to…
Descriptors: Probability, Bayesian Statistics, Cheating, Statistical Analysis
Sinharay, Sandip; Johnson, Matthew S. – Grantee Submission, 2021
Score differencing is one of six categories of statistical methods used to detect test fraud (Wollack & Schoenig, 2018) and involves the testing of the null hypothesis that the performance of an examinee is similar over two item sets versus the alternative hypothesis that the performance is better on one of the item sets. We suggest, to…
Descriptors: Probability, Bayesian Statistics, Cheating, Statistical Analysis
Sinharay, Sandip – Grantee Submission, 2021
Drasgow, Levine, and Zickar (1996) suggested a statistic based on the Neyman-Pearson lemma (e.g., Lehmann & Romano, 2005, p. 60) for detecting preknowledge on a known set of items. The statistic is a special case of the optimal appropriateness indices of Levine and Drasgow (1988) and is the most powerful statistic for detecting item…
Descriptors: Robustness (Statistics), Hypothesis Testing, Statistics, Test Items
Sinharay, Sandip; Johnson, Matthew S. – Grantee Submission, 2019
According to Wollack and Schoenig (2018), score differencing is one of six types of statistical methods used to detect test fraud. In this paper, we suggested the use of Bayes factors (e.g., Kass & Raftery, 1995) for score differencing. A simulation study shows that the suggested approach performs slightly better than an existing frequentist…
Descriptors: Cheating, Deception, Statistical Analysis, Bayesian Statistics
Sinharay, Sandip – Grantee Submission, 2019
Benefiting from item preknowledge (e.g., McLeod, Lewis, & Thissen, 2003) is a major type of fraudulent behavior during educational assessments. This paper suggests a new statistic that can be used for detecting the examinees who may have benefitted from item preknowledge using their response times. The statistic quantifies the difference in…
Descriptors: Test Items, Cheating, Reaction Time, Identification
Sinharay, Sandip; Duong, Minh Q.; Wood, Scott W. – Journal of Educational Measurement, 2017
As noted by Fremer and Olson, analysis of answer changes is often used to investigate testing irregularities because the analysis is readily performed and has proven its value in practice. Researchers such as Belov, Sinharay and Johnson, van der Linden and Jeon, van der Linden and Lewis, and Wollack, Cohen, and Eckerly have suggested several…
Descriptors: Identification, Statistics, Change, Tests
Sinharay, Sandip; Johnson, Matthew S. – Grantee Submission, 2019
According to Wollack and Schoenig (2018), benefitting from item preknowledge is one of the three broad types of test fraud that occur in educational assessments. We use tools from constrained statistical inference to suggest a new statistic that is based on item scores and response times and can be used to detect the examinees who may have…
Descriptors: Scores, Test Items, Reaction Time, Cheating
Sinharay, Sandip; Johnson, Matthew S. – Educational and Psychological Measurement, 2017
In a pioneering research article, Wollack and colleagues suggested the "erasure detection index" (EDI) to detect test tampering. The EDI can be used with or without a continuity correction and is assumed to follow the standard normal distribution under the null hypothesis of no test tampering. When used without a continuity correction,…
Descriptors: Deception, Identification, Testing Problems, Error of Measurement
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2017
An increasing concern of producers of educational assessments is fraudulent behavior during the assessment (van der Linden, 2009). Benefiting from item preknowledge (e.g., Eckerly, 2017; McLeod, Lewis, & Thissen, 2003) is one type of fraudulent behavior. This article suggests two new test statistics for detecting individuals who may have…
Descriptors: Test Items, Cheating, Testing Problems, Identification
Sinharay, Sandip – Grantee Submission, 2018
Tatsuoka (1984) suggested several extended caution indices and their standardized versions that have been used as person-fit statistics by researchers such as Drasgow, Levine, and McLaughlin (1987), Glas and Meijer (2003), and Molenaar and Hoijtink (1990). However, these indices are only defined for tests with dichotomous items. This paper extends…
Descriptors: Test Format, Goodness of Fit, Item Response Theory, Error Patterns
Sinharay, Sandip – Journal of Educational and Behavioral Statistics, 2016
Meijer and van Krimpen-Stoop noted that the number of person-fit statistics (PFSs) that have been designed for computerized adaptive tests (CATs) is relatively modest. This article partially addresses that concern by suggesting three new PFSs for CATs. The statistics are based on tests for a change point and can be used to detect an abrupt change…
Descriptors: Computer Assisted Testing, Adaptive Testing, Item Response Theory, Goodness of Fit
Sinharay, Sandip; Wan, Ping; Choi, Seung W.; Kim, Dong-In – Journal of Educational Measurement, 2015
With an increase in the number of online tests, the number of interruptions during testing due to unexpected technical issues seems to be on the rise. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. Researchers such as…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Statistical Analysis
Sinharay, Sandip; Wan, Ping; Whitaker, Mike; Kim, Dong-In; Zhang, Litong; Choi, Seung W. – Journal of Educational Measurement, 2014
With an increase in the number of online tests, interruptions during testing due to unexpected technical issues seem unavoidable. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. There is a lack of research on this…
Descriptors: Computer Assisted Testing, Testing Problems, Scores, Regression (Statistics)
Sinharay, Sandip; Haberman, Shelby – Educational Testing Service, 2011
Recently, the literature has seen increasing interest in subscores for their potential diagnostic values; for example, one study suggested the report of weighted averages of a subscore and the total score, whereas others showed, for various operational and simulated data sets, that weighted averages, as compared to subscores, lead to more accurate…
Descriptors: Equated Scores, Weighted Scores, Tests, Statistical Analysis