Cook, Ryan M.; Wind, Stefanie A. – Measurement and Evaluation in Counseling and Development, 2024
This article discusses reliability and precision through the lens of a modern measurement approach, item response theory (IRT). Reliability evidence in counseling is primarily generated using classical test theory (CTT) approaches, although recent studies in the field have shown the benefits of using…
Descriptors: Item Response Theory, Measurement, Reliability, Accuracy
Ferrando, Pere J.; Lorenzo-Seva, Urbano – Educational and Psychological Measurement, 2019
Measures initially designed to be single-trait often yield data that are compatible with both an essentially unidimensional factor-analysis (FA) solution and a correlated-factors solution. For these cases, this article proposes an approach aimed at providing information for deciding which of the two solutions is the most appropriate and useful.…
Descriptors: Factor Analysis, Computation, Reliability, Goodness of Fit
Choi, Youn-Jeng; Asilkalkan, Abdullah – Measurement: Interdisciplinary Research and Perspectives, 2019
About 45 R packages to analyze data using item response theory (IRT) have been developed over the last decade. This article introduces these 45 R packages with their descriptions and features. It also describes possible advanced IRT models using R packages, as well as dichotomous and polytomous IRT models, and R packages that contain applications…
Descriptors: Item Response Theory, Data Analysis, Computer Software, Test Bias
Perrin, Charles L. – Journal of Chemical Education, 2017
The disadvantages of the usual linear least-squares analysis of first- and second-order kinetic data are described, and nonlinear least-squares fitting is recommended as an alternative.
Descriptors: Kinetics, Least Squares Statistics, Alternative Assessment, Goodness of Fit
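To illustrate the contrast drawn in this abstract, here is a minimal sketch for first-order decay only, assuming the model y = a0 * exp(-k * t). The function names, the symbols a0 and k, and the synthetic data are illustrative assumptions, not taken from the article. The linearized fit regresses ln(y) on t, which minimizes error in ln(y) rather than in y; the nonlinear fit uses Gauss-Newton iterations to minimize squared residuals in y directly, starting from the linearized estimate.

```python
import math


def linearized_fit(t, y):
    """Ordinary least squares on ln(y) versus t; returns (a0, k)."""
    n = len(t)
    log_y = [math.log(v) for v in y]
    t_mean = sum(t) / n
    ly_mean = sum(log_y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (li - ly_mean) for ti, li in zip(t, log_y))
    slope = sxy / sxx
    intercept = ly_mean - slope * t_mean
    return math.exp(intercept), -slope


def nonlinear_fit(t, y, a0, k, iterations=50, tol=1e-12):
    """Gauss-Newton fit of y = a0*exp(-k*t), minimizing squared residuals in y."""
    for _ in range(iterations):
        # Accumulate the normal equations (J^T J) delta = J^T r.
        s11 = s12 = s22 = g1 = g2 = 0.0
        for ti, yi in zip(t, y):
            e = math.exp(-k * ti)
            r = yi - a0 * e          # residual in the original y scale
            j1 = e                   # derivative of the model w.r.t. a0
            j2 = -a0 * ti * e        # derivative of the model w.r.t. k
            s11 += j1 * j1
            s12 += j1 * j2
            s22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        det = s11 * s22 - s12 * s12
        d_a0 = (s22 * g1 - s12 * g2) / det
        d_k = (s11 * g2 - s12 * g1) / det
        a0 += d_a0
        k += d_k
        if abs(d_a0) + abs(d_k) < tol:
            break
    return a0, k


# Hypothetical synthetic data: exact decay with a0 = 2.0 and k = 0.5.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
signal = [2.0 * math.exp(-0.5 * ti) for ti in times]

start = linearized_fit(times, signal)
a0_hat, k_hat = nonlinear_fit(times, signal, *start)
print(f"a0 = {a0_hat:.4f}, k = {k_hat:.4f}")
```

On noise-free data the two fits agree; they diverge when measurement noise is roughly constant in y, because the log transform then gives small late-time values disproportionate influence.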
Goldstein, Harvey – British Educational Research Journal, 2015
A response is made to a paper that urges the use of the Rasch model for educational assessment. This paper argues that the model is inadequate and that claims for its efficacy are exaggerated and technically weak.
Descriptors: Reader Response, Item Response Theory, Educational Assessment, Evaluation Methods
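For readers unfamiliar with the model under discussion, the standard dichotomous Rasch model gives the probability of a correct response as a logistic function of the difference between person ability and item difficulty. The sketch below shows that textbook form only; the function and parameter names are illustrative assumptions, and nothing here reproduces the arguments of either paper.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the same logit scale; the model has a single
    item parameter (difficulty) and no discrimination or guessing terms.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# When ability equals difficulty, the probability of success is one half.
print(rasch_probability(0.0, 0.0))
```

Because every item shares the same slope, the model implies item response curves that never cross.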
Humphry, Stephen M.; McGrane, Joshua A. – Australian Educational Researcher, 2015
This paper presents a method for equating writing assessments using pairwise comparisons which does not depend upon conventional common-person or common-item equating designs. Pairwise comparisons have been successfully applied in the assessment of open-ended tasks in English and other areas such as visual art and philosophy. In this paper,…
Descriptors: Writing Evaluation, Evaluation Methods, Comparative Analysis, Writing Tests
France, Stephen L.; Batchelder, William H. – Educational and Psychological Measurement, 2015
Cultural consensus theory (CCT) is a data aggregation technique with many applications in the social and behavioral sciences. We describe the intuition and theory behind a set of CCT models for continuous type data using maximum likelihood inference methodology. We describe how bias parameters can be incorporated into these models. We introduce…
Descriptors: Maximum Likelihood Statistics, Test Items, Difficulty Level, Test Theory
Westfall, Peter H.; Henning, Kevin S. S.; Howell, Roy D. – Structural Equation Modeling: A Multidisciplinary Journal, 2012
This article shows how interfactor correlation is affected by error correlations. Theoretical and practical justifications for error correlations are given, and a new equivalence class of models is presented to explain the relationship between interfactor correlation and error correlations. The class allows simple, parsimonious modeling of error…
Descriptors: Psychometrics, Correlation, Error of Measurement, Structural Equation Models
Haberman, Shelby J.; Dorans, Neil J. – Educational Testing Service, 2011
For testing programs that administer multiple forms within a year and across years, score equating is used to ensure that scores can be used interchangeably. In an ideal world, sample sizes are large and representative of populations that hardly change over time, and very reliable alternate test forms are built with nearly identical psychometric…
Descriptors: Scores, Reliability, Equated Scores, Test Construction
Hur, Eun Hye; Glassman, Michael; Kim, Yunhwan – Educational Assessment, Evaluation and Accountability, 2013
This paper develops a Democratic Classroom Survey to measure students' perceptions of the democratic environment of the classroom. Perceived democratic environment is one of the most important variables for understanding classroom activity and indeed any type of group activity, but actually measuring perceptions in an objective manner has been…
Descriptors: Classroom Environment, Test Construction, Program Validation, Democratic Values
New Mexico Public Education Department, 2007
The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview of the 2007 NMSBA and its technical characteristics. The 2007 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Summary of student performance; (4) Statistical analyses of item and…
Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring
Raykov, Tenko – Structural Equation Modeling: A Multidisciplinary Journal, 2006
A structural equation modeling based method is outlined that accomplishes interval estimation of individual optimal scores resulting from multiple-component measuring instruments evaluating single underlying latent dimensions. The procedure capitalizes on the linear combination of a prespecified set of measures that is associated with maximal…
Descriptors: Scores, Structural Equation Models, Reliability, Validity
Griph, Gerald W. – New Mexico Public Education Department, 2006
The purpose of the NMSBA technical report is to provide users and other interested parties with a general overview of the 2006 NMSBA and its technical characteristics. The 2006 technical report contains the following information: (1) Test development; (2) Scoring procedures; (3) Calibration, scaling, and equating procedures; (4) Standard setting;…
Descriptors: Interrater Reliability, Standard Setting, Measures (Individuals), Scoring
McDermott, Paul A.; Watkins, Marley W. – 1979
A computer program named Program STANDARD is presented and demonstrated. This program calculates the statistical significance of the overall agreement of the categorical assignments. The program is based on Light's statistic, G, for describing the conjoint agreement of many observers with a correct or standard set of classifications on nominal…
Descriptors: Classification, Computer Programs, Goodness of Fit, Nonparametric Statistics