Ockey, Gary J.; Wagner, Elvis – Language Learning & Language Teaching, 2018
This book is relevant for language testers, listening researchers, and oral proficiency teachers, in that it explores four broad themes related to the assessment of L2 listening ability: the use of authentic, real-world spoken texts; the effects of different speech varieties of listening inputs; the use of audio-visual texts; and assessing…
Descriptors: Listening Comprehension, Second Language Learning, Second Language Instruction, Listening Comprehension Tests
Porter, Andrew C.; Polikoff, Morgan S.; Goldring, Ellen; Murphy, Joseph; Elliott, Stephen N.; May, Henry – Educational Administration Quarterly, 2010
Research has consistently shown that principal leadership matters for successful schools. Evaluating principals on the behaviors shown to improve student learning should be an important leverage point for raising leadership quality. Yet principals are often evaluated with the use of instruments with no theoretical background and little, if any,…
Descriptors: Psychometrics, Instructional Leadership, Principals, Test Construction
Andrich, David – 1984
Both the attenuation paradox of traditional test theory and the assumption of local independence in person-item response theory have caused problems in interpretation. This paper demonstrates that the two are related concepts, and, through this demonstration, both are clarified. It is demonstrated that the breakdown of local independence leads to…
Descriptors: Latent Trait Theory, Test Interpretation, Test Items, Test Reliability
Doolittle, Allen E. – 1983
The stability of selected indices for detecting differential item performance (item bias), from one randomly equivalent sample to another, is addressed. Some recent research has criticized these indices as too unreliable for utility in measuring bias in achievement test items. Using data from a national testing of the ACT Assessment, however, this…
Descriptors: Black Students, Item Analysis, Racial Factors, Reliability

Antonak, Richard F.; Harth, Robert – Mental Retardation, 1994
Psychometric analyses of data from 230 individuals yielded a 29-item 4-scale revision of the original 50-item 5-scale Mental Retardation Attitude Inventory. Results showed adequate item characteristics; adequate reliability and homogeneity; adequate reliability, homogeneity, specificity, and independence of the four scales; and initial validity…
Descriptors: Attitude Measures, Attitudes toward Disabilities, Mental Retardation, Psychometrics
Loyd, Brenda H. – 1984
One form of adaptive testing involves a two-stage procedure. The first stage is the administration of a routing test. From this first test, an estimate of an examinee's ability is obtained. On the basis of this ability estimate, a second test focused on a given ability level is administered. The purpose of this study was to compare the efficiency…
Descriptors: Academic Ability, Adaptive Testing, Difficulty Level, Elementary Education
Hodgin, Robert F. – 1984
Guidelines for the construction and use of an attitude instrument are presented, and the application of the instrument to measure student attitude toward economics is described. Attention is directed to the Likert-like summated forced-choice variety of attitude instrument, whereby attitude toward the object is inferred from the summed responses to…
Descriptors: Attitude Measures, Economics Education, Higher Education, Item Analysis
Reid, Jerry B. – 1985
This report investigates an area of uncertainty in using the Angoff method for setting standards, namely whether or not a judge's conceptualizations of borderline group performance are realistic. Ratings are usually made with reference to the performance of this hypothetical group; therefore, the Angoff method's success depends on this point.…
Descriptors: Certification, Cutting Scores, Difficulty Level, Interrater Reliability
Haladyna, Thomas M. – 1984
The purpose of this study is to examine an option-weighting method as it affects pass-fail decisions in formative and summative evaluation of student achievement for instructional units, certification, advancement, licensure, admissions, placement, and selection. A database was constructed using high school achievement test data where a…
Descriptors: Achievement Tests, Cutting Scores, High Schools, Multiple Choice Tests
Jackson, Douglas N. – 1983
Concern for enhancing construct validity of vocational interest measures provides a focus for scale construction quite distinct from that derived from a criterion-referenced strategy. Construct-oriented measurement implies: (1) substantive definitions of dimensions; (2) concern for internal consistency reliability, as well as generalizability; (3)…
Descriptors: Career Counseling, Criterion Referenced Tests, Factor Analysis, Interest Inventories

Germann, Paul J. – Journal of Research in Science Teaching, 1989
Describes a paper-and-pencil test for high school biology students measuring science process skills, such as developing hypotheses; making predictions; identifying assumptions; analyzing data; and formulating conclusions. Reports some data on reliability and validity of the test. Provides all 35 items of the test. (YP)
Descriptors: Biology, Science Materials, Science Tests, Secondary Education
Haladyna, Thomas M.; Downing, Steven M. – 1985
In this paper 45 item-writing rules for multiple-choice tests presented in textbooks on educational measurement in a previous study are identified. The current study presents a quantitative review of the literature with respect to the empirical and theoretical evaluation of these principles of item-writing. Fifty-six studies that addressed at…
Descriptors: Educational Research, Elementary Secondary Education, Item Analysis, Multiple Choice Tests
Harvill, Leo M. – 1984
The objectives for this study were to: (1) develop a valid, reliable measure of test-wiseness with equivalent forms for use with students in the health sciences; and (2) determine the level of test-wiseness of entering medical students. The test-wiseness areas included in this study were: similar options, umbrella term, item give-away, convergence…
Descriptors: Higher Education, Measurement Techniques, Medical Students, Multiple Choice Tests
Sax, Gilbert; Reiter, Pauline B. – 1980
Despite the popularity of both multiple-choice (MC) and true-false (TF) items, most investigations comparing the two formats have done so to determine the optimum number of choices to be given to students within a given time period. The purpose of this investigation was to compare the reliabilities and the validities of both formats when the items…
Descriptors: Analysis of Variance, Correlation, Higher Education, Item Analysis
Tomsic, Margie L.; And Others – 1987
Extended caution indices (ECI) specify the degree of confidence that can be placed in an individual's test score by analyzing patterns of item response. Among the most promising of such indices are the standardized ECIs. Contrary to the literature, several instances were found, in a previous study, of nonnormal distributions of ECIs with samples…
Descriptors: Achievement Tests, Elementary Education, Goodness of Fit, Latent Trait Theory