Does the Choice of Response Time Threshold Procedure Substantially Affect Inferences Concerning the Identification and Exclusion of Rapid Guessing Responses? A Meta-Analysis.

Rios, Joseph A.; Deng, Jiayi

Peer reviewed

ERIC Number: EJ1305108

Record Type: Journal

Publication Date: 2021

Pages: 25

Abstractor: As Provided

ISBN: N/A

ISSN: EISSN-2196-0739

EISSN: N/A

Large-scale Assessments in Education, v9 Article 18 2021

Background: In testing contexts that are predominately concerned with power, rapid guessing (RG) has the potential to undermine the validity of inferences made from educational assessments, as such responses are unreflective of the knowledge, skills, and abilities assessed. Given this concern, practitioners/researchers have utilized a multitude of response time threshold procedures that classify RG responses in these contexts based on no empirical data (e.g., an arbitrary time limit), on response time distributions, or on a combination of response time and accuracy information. As there is little understanding of how these procedures compare to each other, this meta-analysis sought to investigate whether threshold typology is related to differences in descriptive, measurement property, and performance outcomes in these contexts. Methods: Studies were sampled that: (1) employed two or more response time (RT) threshold procedures to identify and exclude RG responses on the same computer-administered low-stakes power test; and (2) evaluated differences between procedures on the proportion of RG responses and responders, measurement properties, and test performance. Results: Based on as many as 86 effect sizes, our findings indicated non-negligible differences between RT threshold procedures in the proportion of RG responses and responders. The largest differences for these outcomes were observed between procedures using no empirical data and those relying on response time and accuracy information. However, these differences were not related to variability in aggregate-level measurement properties and test performance. Conclusions: When filtering RG responses to improve inferences concerning item properties and group score outcomes, the actual threshold procedure chosen may be of less importance than the act of identifying such deleterious responses. However, given the conservative nature of RT thresholds that use no empirical data, practitioners may look to avoid the use of these procedures when making inferences at the individual level, given their potential for underclassifying RG.

Descriptors: Test Wiseness, Guessing (Tests), Reaction Time, Computer Assisted Testing, Meta Analysis, Effect Size, Accuracy, Inferences

Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://bibliotheek.ehb.be:2123/

Publication Type: Journal Articles; Information Analyses; Reports - Research

Education Level: N/A

Audience: N/A

Language: English

Sponsor: N/A

Authoring Institution: N/A

Grant or Contract Numbers: N/A