Ivens, Stephen H. – 1972
A discussion of criterion-referenced measures is presented. Two characteristics define the criterion-referenced measure: the presence of a performance criterion, and test items keyed to a set of behavioral objectives. The performance criterion, in an educational setting, is usually a relative standard of performance. There are two ways of…
Descriptors: Behavioral Objectives, Criterion Referenced Tests, Item Analysis, Performance Criteria
The Effect of Violating the Assumption of Equal Item Means in Estimating the Livingston Coefficient.

Lovett, Hubert T. – Educational and Psychological Measurement, 1978
The validity of five methods of estimating the reliability of criterion-referenced tests was evaluated across nine conditions of variability among item means. The results were analyzed by analysis of variance, the Newman-Keuls test, and a nonparametric procedure. There was a tendency for all of the methods to be conservative. (Author/JKS)
Descriptors: Analysis of Variance, Criterion Referenced Tests, Item Analysis, Nonparametric Statistics
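For readers unfamiliar with the coefficient named in the title above, the following is a minimal sketch of Livingston's criterion-referenced reliability coefficient as it is commonly stated: K² = (r·s² + (m − C)²) / (s² + (m − C)²), where r is a conventional reliability estimate, s² the observed-score variance, m the mean score, and C the cutting score. The function name and argument names are illustrative assumptions; the record itself does not give the formula or say which five estimation methods were compared.

```python
def livingston_k2(reliability, variance, mean, cut_score):
    """Livingston's K^2 from summary statistics (illustrative sketch).

    reliability: a conventional (norm-referenced) reliability estimate
    variance:    observed-score variance
    mean:        mean observed score
    cut_score:   criterion (cutting) score C
    """
    if variance < 0:
        raise ValueError("variance must be non-negative")
    # Squared distance of the group mean from the cutting score.
    squared_distance = (mean - cut_score) ** 2
    denominator = variance + squared_distance
    if denominator == 0:
        raise ValueError("coefficient is undefined when variance and (mean - cut) are both zero")
    return (reliability * variance + squared_distance) / denominator


if __name__ == "__main__":
    # Hypothetical values chosen only to show the calculation.
    print(livingston_k2(reliability=0.6, variance=4.0, mean=7.0, cut_score=5.0))
```

When the mean equals the cutting score, the squared-distance term vanishes and the coefficient reduces to the conventional reliability estimate; it rises toward 1 as the mean moves away from the cutting score.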

Wilcox, Rand R. – Psychometrika, 1978
Several Bayesian approaches to the simultaneous estimation of the means of k binomial populations are discussed. This has particular applicability to criterion-referenced or mastery testing. (Author/JKS)
Descriptors: Bayesian Statistics, Criterion Referenced Tests, Mastery Tests, Probability

Livingston, Samuel A. – Journal of Educational Measurement, 1972
Author replies to article TM 500 559. (MB)
Descriptors: Criterion Referenced Tests, Measurement Techniques, Norm Referenced Tests, Scoring

Raju, Nambury S. – Educational and Psychological Measurement, 1982
Rajaratnam, Cronbach and Gleser's generalizability formula for stratified-parallel tests and Raju's coefficient beta are generalized to estimate the reliability of a composite of criterion-referenced tests, where the parts have different cutting scores. (Author/GK)
Descriptors: Criterion Referenced Tests, Cutting Scores, Mathematical Formulas, Scoring Formulas

Goodstein, H. A. – Journal of Special Education, 1982
A review of alternative methodologies and a conceptual framework for the study of reliability of criterion-referenced tests are presented. The possibility of aptitude-x-assessment interactions is considered and implications are discussed. (Author)
Descriptors: Criterion Referenced Tests, Disabilities, Elementary Secondary Education, Research Methodology

Chase, Clint – Mid-Western Educational Researcher, 1996
Classical procedures for calculating the two indices of decision consistency (P and Kappa) for criterion-referenced tests require two testings on each child. Huynh, Peng, and Subkoviak have presented one-testing procedures for these indices. These indices can be estimated without any test administration using Ebel's estimates of the mean, standard…
Descriptors: Criterion Referenced Tests, Educational Research, Educational Testing, Estimation (Mathematics)
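The classical two-testing calculation that the abstract above refers to can be sketched as follows. This is an illustration only: it computes the proportion of consistent mastery decisions (P) and Cohen's kappa from two administrations, and it does not implement the one-testing procedures of Huynh, Peng, or Subkoviak, nor the Ebel-based estimates. The function name, the "score at or above the cut counts as mastery" rule, and the sample data are assumptions.

```python
def decision_consistency(scores_a, scores_b, cut_score):
    """Return (P, kappa) for mastery decisions on two testings of the same examinees."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    n = len(scores_a)
    # Assumed decision rule: a score at or above the cut counts as mastery.
    master_a = [s >= cut_score for s in scores_a]
    master_b = [s >= cut_score for s in scores_b]
    # P: proportion of examinees classified the same way on both testings.
    p_observed = sum(a == b for a, b in zip(master_a, master_b)) / n
    # Agreement expected by chance from the marginal mastery rates.
    rate_a = sum(master_a) / n
    rate_b = sum(master_b) / n
    p_chance = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_chance == 1.0:
        # Everyone falls in one category on both testings; kappa is undefined.
        return p_observed, float("nan")
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, kappa


if __name__ == "__main__":
    # Hypothetical scores for six examinees, cut score of 6.
    first = [8, 9, 4, 3, 7, 2]
    second = [9, 7, 5, 6, 8, 1]
    print(decision_consistency(first, second, cut_score=6))
```

Kappa corrects P for the agreement that the marginal mastery rates alone would produce, which is why the two indices can lead to different conclusions about the same test.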
Moyer, Judith E.; Fishbein, Ronald L. – 1977
The problem that this research addressed was one of decision making. Given three sets of criterion-referenced tests which were designed to be parallel in content, would a traditional reliability coefficient produce different decisions about the reliability of those tests than would kappa? The procedure used collected statewide results on 136 test…
Descriptors: Analysis of Variance, Comparative Analysis, Criterion Referenced Tests, Measurement Techniques
Sanders, James R. – 1976
Applied Performance Tests (APT) are defined as instruments designed to measure performance in an actual or simulated setting. They require at least a close approximation of the setting (if not the actual setting) to which the performance is expected to be transferred. This paper outlines measurement problems and issues that are unique to APT. It…
Descriptors: Criterion Referenced Tests, Elementary Secondary Education, Measurement, Performance Tests
Educational Testing Service, Princeton, NJ. – 1973
A filmstrip with associated audio track has been developed to cover the major planning steps in the development of a measurement instrument such as a test or questionnaire. The filmstrip addresses the following six questions: Why am I testing? What should I test? Whom am I testing? What kinds of questions should I use? How long should my test be?…
Descriptors: Criterion Referenced Tests, Filmstrips, Guides, Instructional Films
Woodson, M. I. Charles E.
The item (difficulty and discrimination) and test (reliability and validity) statistics in classical test theory are highly dependent upon the calibration sample of individuals used. The estimates of item and test parameters in classical test theory are valid within a range of interest along the characteristic measured. Generally, this range of…
Descriptors: Criterion Referenced Tests, Item Analysis, Research Reports, Statistics
Randall, Robert S. – 1972
Differences in design between norm referenced measures (NRM) and criterion referenced measures (CRM) are reviewed, and some of the procedures proposed on designing and evaluating CRM are examined. Differences in design of NRM and CRM are said to arise from the different purposes that underlie each measure. In addition, there are differences among…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Norm Referenced Tests, Test Construction
Willoughby, Lee; And Others – 1976
This study compared a domain referenced approach with a traditional psychometric approach in the construction of a test. Results of the December, 1975 Quarterly Profile Exam (QPE) administered to 400 examinees at a university were the source of data. The 400 item QPE is a five alternative multiple choice test of information a "safe"…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Norm Referenced Tests, Statistical Analysis

Kane, Michael T. – Journal of Educational Measurement, 1986
These analyses suggest that if a criterion-referenced test had a reliability (defined in terms of internal consistency) below 0.5, a simple a priori procedure would provide better estimates of students' universe scores than would individual observed scores. (Author/LMO)
Descriptors: Criterion Referenced Tests, Educational Research, Error of Measurement, Generalizability Theory
Paradowski, Michal B. – Online Submission, 2002
The paper discusses the key criteria of good language tests: practicality, validity, and reliability.
Descriptors: Language Tests, Criterion Referenced Tests, Test Reliability, Test Validity