ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	4
Since 2016 (last 10 years)	12

Descriptor

Item Response Theory	12
Test Items	9
Difficulty Level	4
Item Analysis	4
Scores	4
Statistical Analysis	4
Computation	3
Effect Size	3
Reaction Time	3
Sample Size	3
Weighted Scores	3
Data Analysis	2
Nonparametric Statistics	2
Psychometrics	2
Scoring	2
Simulation	2
Test Bias	2
Test Length	2
Test Reliability	2
Test Wiseness	2
Testing	2
True Scores	2
Accuracy	1
College Freshmen	1
College Students	1
More ▼

Source

ETS Research Report Series	9
Journal of Educational…	2
Applied Measurement in…	1

Author

Guo, Hongwen	12
Dorans, Neil J.	4
Ling, Guangming	2
Lu, Ru	2
Rios, Joseph A.	2
Dorans, Neil	1
Ercikan, Kadriye	1
Frankel, Lois	1
Gu, Lin	1
Haberman, Shelby	1
Johnson, Matthew S.	1
Kyllonen, Patrick	1
Liu, Lydia O.	1
Liu, Ou Lydia	1
McCaffrey, Dan F.	1
Paek, Insu	1
Robin, Frederic	1
Schmitt, Neal	1
Wang, Jing	1
Wang, Zhen	1
Yang, Zhitong	1
Zu, Jiyun	1
More ▼

Publication Type

Journal Articles	12
Reports - Research	12

Education Level

Higher Education	2
Postsecondary Education	2

Audience

Location

Canada	1
United States	1

Laws, Policies, & Programs

Assessments and Surveys

What Works Clearinghouse Rating

Showing all 12 results Save | Export

Alternative Methods for Item Parameter Estimation: From CTT to IRT. Research Report. ETS RR-22-12

Peer reviewed
PDF on ERIC

Download full text

Guo, Hongwen; Lu, Ru; Johnson, Matthew S.; McCaffrey, Dan F. – ETS Research Report Series, 2022

It is desirable for an educational assessment to be constructed of items that can differentiate different performance levels of test takers, and thus it is important to estimate accurately the item discrimination parameters in either classical test theory or item response theory. It is particularly challenging to do so when the sample sizes are…

Descriptors: Test Items, Item Response Theory, Item Analysis, Educational Assessment

Using Weighted Sum Scores to Close the Gap between DIF Practice and Theory

Peer reviewed

Direct link

Guo, Hongwen; Dorans, Neil J. – Journal of Educational Measurement, 2020

We make a distinction between the operational practice of using an observed score to assess differential item functioning (DIF) and the concept of departure from measurement invariance (DMI) that conditions on a latent variable. DMI and DIF indices of effect sizes, based on the Mantel-Haenszel test of common odds ratio, converge under restricted…

Descriptors: Weighted Scores, Test Items, Item Response Theory, Measurement

Influence of Selected-Response Format Variants on Test Characteristics and Test-Taking Effort: An Empirical Study. Research Report. ETS RR-22-01

Peer reviewed
PDF on ERIC

Download full text

Guo, Hongwen; Rios, Joseph A.; Ling, Guangming; Wang, Zhen; Gu, Lin; Yang, Zhitong; Liu, Lydia O. – ETS Research Report Series, 2022

Different variants of the selected-response (SR) item type have been developed for various reasons (i.e., simulating realistic situations, examining critical-thinking and/or problem-solving skills). Generally, the variants of SR item format are more complex than the traditional multiple-choice (MC) items, which may be more challenging to test…

Descriptors: Test Format, Test Wiseness, Test Items, Item Response Theory

Observed Scores as Matching Variables in Differential Item Functioning under the One- and Two-Parameter Logistic Models: Population Results. Research Report. ETS RR-19-06

Peer reviewed
PDF on ERIC

Download full text

Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019

We derive formulas for the differential item functioning (DIF) measures that two routinely used DIF statistics are designed to estimate. The DIF measures that match on observed scores are compared to DIF measures based on an unobserved ability (theta or true score) for items that are described by either the one-parameter logistic (1PL) or…

Descriptors: Scores, Test Bias, Statistical Analysis, Item Response Theory

Comparing Test-Taking Behaviors of English Language Learners (ELLs) to Non-ELL Students: Use of Response Time in Measurement Comparability Research. Research Report. ETS RR-21-25

Peer reviewed
PDF on ERIC

Download full text

Guo, Hongwen; Ercikan, Kadriye – ETS Research Report Series, 2021

In this report, we demonstrate use of differential response time (DRT) methodology, an extension of differential item functioning methodology, for examining differences in how students from different backgrounds engage with assessment tasks. We analyze response time data from a digitally delivered mathematics assessment to examine timing…

Descriptors: Test Wiseness, English Language Learners, Reaction Time, Mathematics Tests

A Note on Using Weighted Sum Scores in the P-DIF Statistic. Research Report. ETS RR-19-32

Peer reviewed
PDF on ERIC

Download full text

Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2019

The Mantel-Haenszel delta difference (MH D-DIF) and the standardized proportion difference (STD P-DIF) are two observed-score methods that have been used to assess differential item functioning (DIF) at Educational Testing Service since the early 1990s. Latentvariable approaches to assessing measurement invariance at the item level have been…

Descriptors: Test Bias, Educational Testing, Statistical Analysis, Item Response Theory

Using Existing Data to Inform Development of New Item Types. Research Report. ETS RR-20-01

Peer reviewed
PDF on ERIC

Download full text

Guo, Hongwen; Ling, Guangming; Frankel, Lois – ETS Research Report Series, 2020

With advances in technology, researchers and test developers are developing new item types to measure complex skills like problem solving and critical thinking. Analyzing such items is often challenging because of their complicated response patterns, and thus it is important to develop psychometric methods for practitioners and researchers to…

Descriptors: Test Construction, Test Items, Item Analysis, Psychometrics

Robustness of Weighted Differential Item Functioning (DIF) Analysis: The Case of Mantel-Haenszel DIF Statistics. Research Report. ETS RR-21-12

Peer reviewed
PDF on ERIC

Download full text

Lu, Ru; Guo, Hongwen; Dorans, Neil J. – ETS Research Report Series, 2021

Two families of analysis methods can be used for differential item functioning (DIF) analysis. One family is DIF analysis based on observed scores, such as the Mantel-Haenszel (MH) and the standardized proportion-correct metric for DIF procedures; the other is analysis based on latent ability, in which the statistic is a measure of departure from…

Descriptors: Robustness (Statistics), Weighted Scores, Test Items, Item Analysis

Detecting Item Drift in Large-Scale Testing

Peer reviewed

Direct link

Guo, Hongwen; Robin, Frederic; Dorans, Neil – Journal of Educational Measurement, 2017

The early detection of item drift is an important issue for frequently administered testing programs because items are reused over time. Unfortunately, operational data tend to be very sparse and do not lend themselves to frequent monitoring analyses, particularly for on-demand testing. Building on existing residual analyses, the authors propose…

Descriptors: Testing, Test Items, Identification, Sample Size

Exploring Online Learning Data Using Fractal Dimensions. Research Report. ETS RR-17-15

Peer reviewed
PDF on ERIC

Download full text

Guo, Hongwen – ETS Research Report Series, 2017

Data collected from online learning and tutoring systems for individual students showed strong autocorrelation or dependence because of content connection, knowledge-based dependency, or persistence of learning behavior. When the response data show little dependence or negative autocorrelations for individual students, it is suspected that…

Descriptors: Data Collection, Electronic Learning, Intelligent Tutoring Systems, Information Utilization

Evaluation of Different Scoring Rules for a Noncognitive Test in Development. Research Report. ETS RR-16-03

Peer reviewed
PDF on ERIC

Download full text

Guo, Hongwen; Zu, Jiyun; Kyllonen, Patrick; Schmitt, Neal – ETS Research Report Series, 2016

In this report, systematic applications of statistical and psychometric methods are used to develop and evaluate scoring rules in terms of test reliability. Data collected from a situational judgment test are used to facilitate the comparison. For a well-developed item with appropriate keys (i.e., the correct answers), agreement among various…

Descriptors: Scoring, Test Reliability, Statistical Analysis, Psychometrics

A New Procedure for Detection of Students' Rapid Guessing Responses Using Response Time

Peer reviewed

Direct link

Guo, Hongwen; Rios, Joseph A.; Haberman, Shelby; Liu, Ou Lydia; Wang, Jing; Paek, Insu – Applied Measurement in Education, 2016

Unmotivated test takers using rapid guessing in item responses can affect validity studies and teacher and institution performance evaluation negatively, making it critical to identify these test takers. The authors propose a new nonparametric method for finding response-time thresholds for flagging item responses that result from rapid-guessing…

Descriptors: Guessing (Tests), Reaction Time, Nonparametric Statistics, Models