ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	1
Since 2016 (last 10 years)	4
Since 2006 (last 20 years)	13

Descriptor

Goodness of Fit	14
Reliability	8
Test Reliability	6
Item Response Theory	5
Error of Measurement	4
Test Construction	4
Computation	3
Correlation	3
Psychometrics	3
Scoring	3
Test Items	3
Test Validity	3
Academic Standards	2
Accuracy	2
Classification	2
Cutting Scores	2
English	2
Equated Scores	2
Evaluation Methods	2
Factor Analysis	2
Interrater Reliability	2
Mathematics Achievement	2
Measurement Techniques	2
Measures (Individuals)	2
Program Validation	2

Source

Educational and Psychological…	2
New Mexico Public Education…	2
Structural Equation Modeling:…	2
Australian Educational…	1
British Educational Research…	1
Educational Assessment,…	1
Educational Testing Service	1
Journal of Chemical Education	1
Measurement and Evaluation in…	1
Measurement:…	1

Publication Type

Reports - Descriptive	14
Journal Articles	10
Numerical/Quantitative Data	2
Non-Print Media	1
Opinion Papers	1

Education Level

Elementary Secondary Education	2
Higher Education	2
Postsecondary Education	1

Audience

Researchers

Location

New Mexico	2
Australia	1

Showing all 14 results

Item Response Theory: A Modern Measurement Approach to Reliability and Precision for Counseling Researchers

Peer reviewed

Ryan M. Cook; Stefanie A. Wind – Measurement and Evaluation in Counseling and Development, 2024

The purpose of this article is to discuss reliability and precision through the lens of a modern measurement approach, item response theory (IRT). Reliability evidence in the field of counseling is primarily generated using Classical Test Theory (CTT) approaches, although recent studies in the field of counseling have shown the benefits of using…

Descriptors: Item Response Theory, Measurement, Reliability, Accuracy

On the Added Value of Multiple Factor Score Estimates in Essentially Unidimensional Models

Peer reviewed

Ferrando, Pere J.; Lorenzo-Seva, Urbano – Educational and Psychological Measurement, 2019

Measures initially designed to be single-trait often yield data that are compatible with both an essentially unidimensional factor-analysis (FA) solution and a correlated-factors solution. For these cases, this article proposes an approach aimed at providing information for deciding which of the two solutions is the most appropriate and useful.…

Descriptors: Factor Analysis, Computation, Reliability, Goodness of Fit

R Packages for Item Response Theory Analysis: Descriptions and Features

Peer reviewed

Choi, Youn-Jeng; Asilkalkan, Abdullah – Measurement: Interdisciplinary Research and Perspectives, 2019

About 45 R packages to analyze data using item response theory (IRT) have been developed over the last decade. This article introduces these 45 R packages with their descriptions and features. It also describes possible advanced IRT models using R packages, as well as dichotomous and polytomous IRT models, and R packages that contain applications…

Descriptors: Item Response Theory, Data Analysis, Computer Software, Test Bias

Linear or Nonlinear Least-Squares Analysis of Kinetic Data?

Peer reviewed

Perrin, Charles L. – Journal of Chemical Education, 2017

The disadvantages of the usual linear least-squares analysis of first- and second-order kinetic data are described, and nonlinear least-squares fitting is recommended as an alternative.

Descriptors: Kinetics, Least Squares Statistics, Alternative Assessment, Goodness of Fit

Rasch Measurement: A Response to Payanides, Robinson and Tymms

Peer reviewed

Goldstein, Harvey – British Educational Research Journal, 2015

A response is made to a paper that urges the use of the Rasch model for educational assessment. This paper argues that the model is inadequate and that claims for its efficacy are exaggerated and technically weak.

Descriptors: Reader Response, Item Response Theory, Educational Assessment, Evaluation Methods

Equating a Large-Scale Writing Assessment Using Pairwise Comparisons of Performances

Peer reviewed

Humphry, Stephen M.; McGrane, Joshua A. – Australian Educational Researcher, 2015

This paper presents a method for equating writing assessments using pairwise comparisons which does not depend upon conventional common-person or common-item equating designs. Pairwise comparisons have been successfully applied in the assessment of open-ended tasks in English and other areas such as visual art and philosophy. In this paper,…

Descriptors: Writing Evaluation, Evaluation Methods, Comparative Analysis, Writing Tests

Maximum Likelihood Item Easiness Models for Test Theory without an Answer Key

Peer reviewed

France, Stephen L.; Batchelder, William H. – Educational and Psychological Measurement, 2015

Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce…

Descriptors: Maximum Likelihood Statistics, Test Items, Difficulty Level, Test Theory

The Effect of Error Correlation on Interfactor Correlation in Psychometric Measurement

Peer reviewed

Westfall, Peter H.; Henning, Kevin S. S.; Howell, Roy D. – Structural Equation Modeling: A Multidisciplinary Journal, 2012

This article shows how interfactor correlation is affected by error correlations. Theoretical and practical justifications for error correlations are given, and a new equivalence class of models is presented to explain the relationship between interfactor correlation and error correlations. The class allows simple, parsimonious modeling of error…

Descriptors: Psychometrics, Correlation, Error of Measurement, Structural Equation Models

Sources of Score Scale Inconsistency. Research Report. ETS RR-11-10

Haberman, Shelby J.; Dorans, Neil J. – Educational Testing Service, 2011

For testing programs that administer multiple forms within a year and across years, score equating is used to ensure that scores can be used interchangeably. In an ideal world, sample sizes are large and representative of populations that hardly change over time, and very reliable alternate test forms are built with nearly identical psychometric…

Descriptors: Scores, Reliability, Equated Scores, Test Construction

Finding Autonomy in Activity: Development and Validation of a Democratic Classroom Survey

Peer reviewed

Hur, Eun Hye; Glassman, Michael; Kim, Yunhwan – Educational Assessment, Evaluation and Accountability, 2013

This paper developed a Democratic Classroom Survey to measure students' perceived democratic environment of the classroom. Perceived democratic environment is one of the most important variables for understanding classroom activity and indeed any type of group activity, but actually measuring perceptions in an objective manner has been…

Descriptors: Classroom Environment, Test Construction, Program Validation, Democratic Values

New Mexico Standards-Based Assessment Technical Report: Spring 2007 Administration

New Mexico Public Education Department, 2007

The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview of and technical characteristics of the 2007 NMSBA. The 2007 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Summary of student performance; (4) Statistical analyses of item and…

Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring

Interval Estimation of Optimal Scores from Multiple-Component Measuring Instruments via SEM

Peer reviewed

Raykov, Tenko – Structural Equation Modeling: A Multidisciplinary Journal, 2006

A structural equation modeling based method is outlined that accomplishes interval estimation of individual optimal scores resulting from multiple-component measuring instruments evaluating single underlying latent dimensions. The procedure capitalizes on the linear combination of a prespecified set of measures that is associated with maximal…

Descriptors: Scores, Structural Equation Models, Reliability, Validity

New Mexico Standards Based Assessment (NMSBA) Technical Report: 2006 Spring Administration

Griph, Gerald W. – New Mexico Public Education Department, 2006

The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview of and technical characteristics of the 2006 NMSBA. The 2006 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Calibration, scaling, and equating procedures; (4) Standard setting;…

Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring

Program STANDARD (Statistic of Conjoint Multiple Observer Agreement with a Standard).

McDermott, Paul A.; Watkins, Marley W. – 1979

A computer program named Program STANDARD is presented and demonstrated. This program calculates the statistical significance of the overall agreement of the categorical assignments. The program is based on Light's statistic, G, for describing the conjoint agreement of many observers with a correct or standard set of classifications on nominal…

Descriptors: Classification, Computer Programs, Goodness of Fit, Nonparametric Statistics

Author

Asilkalkan, Abdullah	1
Batchelder, William H.	1
Choi, Youn-Jeng	1
Dorans, Neil J.	1
Ferrando, Pere J.	1
France, Stephen L.	1
Glassman, Michael	1
Goldstein, Harvey	1
Griph, Gerald W.	1
Haberman, Shelby J.	1
Henning, Kevin S. S.	1
Howell, Roy D.	1
Humphry, Stephen M.	1
Hur, Eun Hye	1
Kim, Yunhwan	1
Lorenzo-Seva, Urbano	1
McDermott, Paul A.	1
McGrane, Joshua A.	1
Perrin, Charles L.	1
Raykov, Tenko	1
Ryan M. Cook	1
Stefanie A. Wind	1
Watkins, Marley W.	1
Westfall, Peter H.	1