ERIC - Search Results

Showing all 12 results

Detecting Test Fraud Using Bayes Factors

Peer reviewed

Sinharay, Sandip; Johnson, Matthew S. – Grantee Submission, 2019

According to Wollack and Schoenig (2018), score differencing is one of six types of statistical methods used to detect test fraud. In this paper, we suggested the use of Bayes factors (e.g., Kass & Raftery, 1995) for score differencing. A simulation study shows that the suggested approach performs slightly better than an existing frequentist…

Descriptors: Cheating, Deception, Statistical Analysis, Bayesian Statistics

The Use of Item Scores and Response Times to Detect Examinees Who May Have Benefited from Item Preknowledge

Peer reviewed

Sinharay, Sandip; Johnson, Matthew S. – Grantee Submission, 2019

According to Wollack and Schoenig (2018), benefitting from item preknowledge is one of the three broad types of test fraud that occur in educational assessments. We use tools from constrained statistical inference to suggest a new statistic that is based on item scores and response times and can be used to detect the examinees who may have…

Descriptors: Scores, Test Items, Reaction Time, Cheating

Assessing Individual-Level Impact of Interruptions during Online Testing

Peer reviewed

Sinharay, Sandip; Wan, Ping; Choi, Seung W.; Kim, Dong-In – Journal of Educational Measurement, 2015

With an increase in the number of online tests, the number of interruptions during testing due to unexpected technical issues seems to be on the rise. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. Researchers such as…

Descriptors: Computer Assisted Testing, Testing Problems, Scores, Statistical Analysis

Determining the Overall Impact of Interruptions during Online Testing

Peer reviewed

Sinharay, Sandip; Wan, Ping; Whitaker, Mike; Kim, Dong-In; Zhang, Litong; Choi, Seung W. – Journal of Educational Measurement, 2014

With an increase in the number of online tests, interruptions during testing due to unexpected technical issues seem unavoidable. For example, interruptions occurred during several recent state tests. When interruptions occur, it is important to determine the extent of their impact on the examinees' scores. There is a lack of research on this…

Descriptors: Computer Assisted Testing, Testing Problems, Scores, Regression (Statistics)

When Does Scale Anchoring Work? A Case Study

Peer reviewed

Sinharay, Sandip; Haberman, Shelby J.; Lee, Yi-Hsuan – Journal of Educational Measurement, 2011

Providing information to test takers and test score users about the abilities of test takers at different score levels has been a persistent problem in educational and psychological measurement. Scale anchoring, a technique which describes what students at different points on a score scale know and can do, is a tool to provide such information.…

Descriptors: Scores, Test Items, Statistical Analysis, Licensing Examinations (Professions)

Statistical Procedures to Evaluate Quality of Scale Anchoring. Research Report. ETS RR-11-02

Haberman, Shelby J.; Sinharay, Sandip; Lee, Yi-Hsuan – Educational Testing Service, 2011

Providing information to test takers and test score users about the abilities of test takers at different score levels has been a persistent problem in educational and psychological measurement (Carroll, 1993). Scale anchoring (Beaton & Allen, 1992), a technique that describes what students at different points on a score scale know and can do,…

Descriptors: Statistical Analysis, Scores, Regression (Statistics), Item Response Theory

Reporting of Subscores Using Multidimensional Item Response Theory

Peer reviewed

Haberman, Shelby J.; Sinharay, Sandip – Psychometrika, 2010

Recently, there has been increasing interest in reporting subscores. This paper examines reporting of subscores using multidimensional item response theory (MIRT) models (e.g., Reckase in "Appl. Psychol. Meas." 21:25-36, 1997; C.R. Rao and S. Sinharay (Eds), "Handbook of Statistics, vol. 26," pp. 607-642, North-Holland, Amsterdam, 2007; Beguin &…

Descriptors: Item Response Theory, Psychometrics, Statistical Analysis, Scores

Comparison of Subscores Based on Classical Test Theory Methods. Research Report. ETS RR-08-54

Peer reviewed

Puhan, Gautam; Sinharay, Sandip; Haberman, Shelby; Larkin, Kevin – ETS Research Report Series, 2008

Will reporting subscores provide any additional information than the total score? Is there a method that can be used to provide more trustworthy subscores than observed subscores? These 2 questions are addressed in this study. To answer the 2nd question, 2 subscore estimation methods (i.e., subscore estimated from the observed total score or…

Descriptors: Comparative Analysis, Scores, Tests, Certification

Subscores Based on Classical Test Theory: To Report or Not to Report

Peer reviewed

Sinharay, Sandip; Haberman, Shelby; Puhan, Gautam – Educational Measurement: Issues and Practice, 2007

There is an increasing interest in reporting subscores, both at examinee level and at aggregate levels. However, it is important to ensure reasonable subscore performance in terms of high reliability and validity to minimize incorrect instructional and remediation decisions. This article employs a statistical measure based on classical test theory…

Descriptors: Test Reliability, Test Theory, Test Validity, Statistical Analysis

Appropriateness of the TOEIC® Bridge Test for Students in Three Countries of South America

Peer reviewed

Sinharay, Sandip; Powers, Donald E.; Feng, Ying; Saldivia, Luis; Giunta, Anthony; Simpson, Annabelle; Weng, Vincent – Language Testing, 2009

In order to facilitate the interpretation of test scores from the TOEIC® "Bridge" as a measure of English language proficiency, one form of the test was administered to more than 6000 test takers in three South American countries--Colombia, Chile and Ecuador. The appropriateness of the TOEIC "Bridge" test as a measure of…

Descriptors: Factor Analysis, Foreign Countries, Language Skills, English (Second Language)

Establishing the Validity of TOEIC Bridge™ Test Scores for Students in Colombia, Chile, and Ecuador. Research Report. ETS RR-08-58

Peer reviewed

Sinharay, Sandip; Feng, Ying; Saldivia, Luis; Powers, Donald E.; Giunta, Anthony; Simpson, Annabelle; Weng, Vincent – ETS Research Report Series, 2008

The validity of TOEIC Bridge™ scores as a measure of English language skill was examined from the standpoint of a unified concept of test validity. In this study, more than 6,000 test takers in 3 Latin American countries (Chile, Colombia, and Ecuador) took 1 form of the TOEIC Bridge test, and their scores were compared to additional information…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Test Validity

Model Diagnostics for Bayesian Networks. Research Report. ETS RR-04-17

Peer reviewed

Sinharay, Sandip – ETS Research Report Series, 2004

Assessing fit of psychometric models has always been an issue of enormous interest, but there exists no unanimously agreed upon item fit diagnostic for the models. Bayesian networks, frequently used in educational assessments (see, for example, Mislevy, Almond, Yan, & Steinberg, 2001) primarily for learning about students' knowledge and…

Descriptors: Bayesian Statistics, Networks, Models, Goodness of Fit
