Publication Date
In 2025 | 0
Since 2024 | 0
Since 2021 (last 5 years) | 0
Since 2016 (last 10 years) | 2
Since 2006 (last 20 years) | 6
Descriptor
Criterion Referenced Tests | 36
Difficulty Level | 36
Test Items | 36
Test Construction | 24
Item Analysis | 13
Higher Education | 11
Test Reliability | 11
Test Validity | 10
Latent Trait Theory | 8
Mathematical Models | 8
Multiple Choice Tests | 8
…
Author
Roid, Gale | 6
Haladyna, Tom | 3
Hambleton, Ronald K. | 2
Al-Habashneh, Maher Hussein | 1
Bauer, Ernest A. | 1
Beard, Jacob G. | 1
Berk, Ronald A. | 1
Bernknopf, Stan | 1
Brzezinski, Evelyn | 1
Combrinck, Celeste | 1
Cook, Linda L. | 1
…
Education Level
Higher Education | 3
Postsecondary Education | 3
Elementary Secondary Education | 1
Grade 11 | 1
Grade 5 | 1
High Schools | 1
Secondary Education | 1
Audience
Researchers | 4
Assessments and Surveys
Comprehensive Tests of Basic… | 1
Michigan Test of English… | 1
Al-Habashneh, Maher Hussein; Najjar, Nabil Juma – Journal of Education and Practice, 2017
This study aimed to construct a criterion-referenced test measuring the research and statistical competencies of graduate students at Jordanian public universities. In its first form the test consisted of 50 multiple-choice items, which were then submitted to five reviewers with expertise in measurement and evaluation to…
Descriptors: Foreign Countries, Criterion Referenced Tests, Graduate Students, Test Construction
Combrinck, Celeste; Scherman, Vanessa; Maree, David – Perspectives in Education, 2016
This study describes how criterion-referenced feedback was produced from English language, mathematics and natural sciences monitoring assessments. The assessments were designed for grades 8 to 11 to give an overall indication of the curriculum standards attained in a given subject over the course of a year (N = 1113). The Rasch Item Map method was…
Descriptors: Item Response Theory, Feedback (Response), Criterion Referenced Tests, Academic Standards
Lieneck, Cristian; Morrison, Eileen; Price, Larry – Current Issues in Education, 2013
The Texas State University-San Marcos undergraduate healthcare administration program requires all Bachelor of Health Administration (BHA) students to pass a comprehensive examination to demonstrate their knowledge of specific core competencies. Passing also demonstrates completion of their didactic coursework in order to enter a practical…
Descriptors: Exit Examinations, Health Services, Administrator Education, Psychometrics
Wyse, Adam E. – Educational and Psychological Measurement, 2011
Standard setting is a method used to set cut scores on large-scale assessments. One of the most popular standard setting methods is the Bookmark method, in which panelists are asked to envision a response probability (RP) criterion and move through a booklet of ordered items based on that criterion. This study investigates whether…
Descriptors: Testing Programs, Standard Setting (Scoring), Cutting Scores, Probability
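Under the Rasch model the mapping from a bookmarked item to a theta cut score has a closed form; a minimal sketch in Python, assuming a Rasch calibration and the common RP67 criterion (the booklet difficulties below are invented, not taken from the study):

```python
import math

def bookmark_cut_score(ordered_difficulties, bookmark_index, rp=0.67):
    """Map a Bookmark placement to a theta cut score under the Rasch model.

    ordered_difficulties: item difficulties sorted easiest to hardest,
    matching the ordered item booklet panelists page through.
    bookmark_index: 0-based index of the bookmarked item (the last item a
    minimally qualified examinee would answer with probability >= rp).
    rp: response probability criterion (RP67 is common; RP50 and RP80
    are also used).
    """
    b = ordered_difficulties[bookmark_index]
    # Solve P(theta) = rp for the Rasch model P = 1 / (1 + exp(-(theta - b))):
    # theta = b + ln(rp / (1 - rp))
    return b + math.log(rp / (1.0 - rp))

# Illustrative difficulties (logits), not data from the study.
booklet = [-1.8, -1.1, -0.4, 0.2, 0.9, 1.5]
print(bookmark_cut_score(booklet, bookmark_index=3, rp=0.67))  # ~0.91
```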
Meyers, Jason L.; Murphy, Stephen; Goodman, Joshua; Turhan, Ahmet – Pearson, 2012
Operational testing programs employing item response theory (IRT) applications benefit from the property of item parameter invariance, whereby item parameter estimates obtained from one sample can be applied to other samples (when the underlying assumptions are satisfied). In theory, this feature allows for applications such as computer-adaptive…
Descriptors: Equated Scores, Test Items, Test Format, Item Response Theory
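Invariance is what lets a program score new examinees with parameters estimated on an earlier sample. A sketch of that use case, assuming a 2PL model and EAP scoring with a standard normal prior (all parameter values are illustrative, not from this report):

```python
import numpy as np

def eap_theta(responses, a, b, n_quad=61):
    """EAP ability estimate under the 2PL with fixed (pre-calibrated)
    item parameters -- the use case item parameter invariance permits.

    responses: 0/1 vector; a, b: discrimination and difficulty vectors
    estimated from an earlier calibration sample.
    """
    theta = np.linspace(-4, 4, n_quad)
    prior = np.exp(-0.5 * theta**2)               # standard normal prior
    p = 1 / (1 + np.exp(-a[:, None] * (theta[None, :] - b[:, None])))
    like = np.prod(np.where(np.asarray(responses)[:, None] == 1, p, 1 - p), axis=0)
    post = like * prior
    return np.sum(theta * post) / np.sum(post)

# Parameters from a hypothetical prior calibration; a new examinee's responses.
a = np.array([1.2, 0.8, 1.5, 1.0])
b = np.array([-0.5, 0.0, 0.4, 1.1])
print(eap_theta([1, 1, 0, 0], a, b))
```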
Nehm, Ross H.; Schonfeld, Irvin Sam – Journal of Research in Science Teaching, 2008
Growing recognition of the central importance of fostering an in-depth understanding of natural selection has, surprisingly, failed to stimulate work on the development and rigorous evaluation of instruments that measure knowledge of it. We used three different methodological tools, the Conceptual Inventory of Natural Selection (CINS), a modified…
Descriptors: Evolution, Science Education, Interviews, Measures (Individuals)

Haladyna, Tom; Roid, Gale – Journal of Educational Measurement, 1981
The rationale for use of instructional sensitivity in the empirical review of test items is examined, and the results of a study that distinguishes instructional sensitivity from other item concepts are presented. Research is reviewed which indicates the existence of instructional sensitivity as a unique criterion-referenced test item concept. (RL)
Descriptors: Criterion Referenced Tests, Difficulty Level, Evaluation Criteria, Pretests Posttests
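The standard empirical index of instructional sensitivity is the gain in an item's proportion correct from pretest to posttest, often called the pre-to-post difference index; a minimal sketch on invented response matrices:

```python
import numpy as np

def ppdi(pre_responses, post_responses):
    """Pre-to-post difference index (PPDI): the gain in an item's
    proportion correct after instruction. Values near 0 suggest the item
    is insensitive to instruction; values near 1, highly sensitive.
    Rows are examinees, columns are items; entries are 0/1.
    """
    return (np.asarray(post_responses).mean(axis=0)
            - np.asarray(pre_responses).mean(axis=0))

# Illustrative 0/1 response matrices (5 examinees x 3 items), not study data.
pre  = np.array([[0,1,0],[0,0,1],[1,0,0],[0,1,0],[0,0,1]])
post = np.array([[1,1,0],[1,0,1],[1,1,0],[1,1,1],[0,1,1]])
print(ppdi(pre, post))  # item 1 gains 0.6; item 3 only 0.2
```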
Beard, Jacob G.; And Others – 1984
The purpose of this study was to examine the homogeneity in difficulty of item domains and the effectiveness of Rasch pre-equating procedures for adjusting test scores for differences in the difficulty of tests constructed by sampling from item domains. The data used were taken from a field test and calibration of 810 tenth-grade items in…
Descriptors: Achievement Tests, Criterion Referenced Tests, Difficulty Level, Equated Scores
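True-score pre-equating under the Rasch model works by passing a raw score through one form's test characteristic curve to theta and back through the other form's curve; a sketch with hypothetical banked difficulties (not the study's items):

```python
import numpy as np

def tcc(theta, difficulties):
    """Rasch test characteristic curve: expected raw score at theta."""
    return np.sum(1 / (1 + np.exp(-(theta - np.asarray(difficulties)))))

def preequate(raw_x, diffs_x, diffs_y, grid=np.linspace(-5, 5, 2001)):
    """Find the theta where form X's TCC equals raw_x, then return the
    equivalent expected raw score on form Y (true-score pre-equating)."""
    tcc_x = np.array([tcc(t, diffs_x) for t in grid])
    theta = grid[np.argmin(np.abs(tcc_x - raw_x))]
    return tcc(theta, diffs_y)

# Hypothetical banked difficulties for two 5-item forms (logits).
form_x = [-1.0, -0.3, 0.1, 0.6, 1.2]
form_y = [-1.2, -0.5, 0.0, 0.4, 0.9]   # a slightly easier form
print(preequate(3, form_x, form_y))     # raw 3 on X maps a bit higher on Y
```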

Roid, G. H.; Haladyna, Thomas M. – Educational and Psychological Measurement, 1978
Two techniques for writing achievement test items to accompany instructional materials are contrasted: writing items from statements of instructional objectives, and writing items from semi-automated rules for transforming instructional statements. Both systems resulted in about the same number of faulty items. (Author/JKS)
Descriptors: Achievement Tests, Comparative Analysis, Criterion Referenced Tests, Difficulty Level
Divgi, D. R. – 1978
One aim of criterion-referenced testing is to classify an examinee without reference to a norm group; therefore, any statements about the dependability of such classification ought to be group-independent also. A population-independent index is proposed in terms of the probability of incorrect classification near the cutoff true score. The…
Descriptors: Criterion Referenced Tests, Cutting Scores, Difficulty Level, Error of Measurement
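One way to make the idea concrete is a binomial error model, under which the misclassification probability depends only on the examinee's true score and the cutoff, not on any norm group. The sketch below illustrates that idea; it is not Divgi's exact formulation:

```python
from scipy.stats import binom

def misclassification_prob(true_prop, n_items, cut_raw):
    """P(wrong mastery decision | true proportion-correct), assuming
    observed score X ~ Binomial(n_items, true_prop) and the decision rule
    'master iff X >= cut_raw'. Group-independent: it depends only on the
    examinee's true score, never on a norm group.
    """
    pi_cut = cut_raw / n_items
    if true_prop >= pi_cut:               # true master: error is X < cut
        return binom.cdf(cut_raw - 1, n_items, true_prop)
    return binom.sf(cut_raw - 1, n_items, true_prop)  # true nonmaster: X >= cut

# Error probability peaks for true scores right at the cutoff (0.70 here).
for pi in (0.60, 0.68, 0.70, 0.72, 0.80):
    print(pi, round(misclassification_prob(pi, n_items=20, cut_raw=14), 3))
```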

Tsai, Fu-Ju; Suen, Hoi K. – Educational and Psychological Measurement, 1993
Six methods of scoring multiple true-false items were compared in terms of reliabilities, difficulties, and discrimination. Results suggest that, for norm-referenced score interpretations, there is insufficient evidence to support any one of the methods as superior. For criterion-referenced score interpretations, effects of scoring method must be…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Difficulty Level, Guessing (Tests)
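Two of the scoring methods typically contrasted for multiple true-false clusters are easy to state in code: statement-level scoring (each true-false judgment earns a point) versus cluster-level, all-or-nothing scoring. A sketch of the comparison on invented responses, not the study's data:

```python
import numpy as np

# responses[i, j, k] = 1 if examinee i judged statement k of cluster j correctly.
# Illustrative: 3 examinees, 2 clusters, 4 true-false statements per cluster.
rng = np.random.default_rng(0)
responses = rng.integers(0, 2, size=(3, 2, 4))

# Statement scoring (MTF): one point per correctly judged statement.
statement_scores = responses.sum(axis=(1, 2))

# Cluster scoring (MC-style): a cluster earns a point only if every
# statement in it is judged correctly -- harsher, so p-values drop.
cluster_scores = responses.all(axis=2).sum(axis=1)

print(statement_scores, cluster_scores)
```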
Brzezinski, Evelyn; Demaline, Randy – 1982
Item banks offer a different approach to test development than the more traditional item specification method. This study was designed as an initial comparative investigation in this area. Fourth-grade math and reading tests were developed using both items written from item specifications and items drawn from an existing item bank. Four…
Descriptors: Comparative Analysis, Computation, Cost Effectiveness, Criterion Referenced Tests
Berk, Ronald A. – 1978
Sixteen item statistics recommended for use in the development of criterion-referenced tests were evaluated. There were two major criteria: (1) practicability in terms of ease of computation and interpretation and (2) meaningfulness in the context of the development process. Most of the statistics were based on a comparison of performance changes…
Descriptors: Achievement Tests, Criterion Referenced Tests, Difficulty Level, Guides
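One simple index of the kind evaluated here compares item performance between the criterion groups formed by the cut score; a sketch of a masters-minus-nonmasters discrimination index (the index is a common one, but the data and cut below are invented):

```python
import numpy as np

def criterion_group_discrimination(item_responses, total_scores, cut):
    """Proportion correct among masters (total >= cut) minus proportion
    correct among nonmasters -- the sort of easily computed, easily
    interpreted index favored for criterion-referenced test development.
    """
    item_responses = np.asarray(item_responses)
    masters = np.asarray(total_scores) >= cut
    return item_responses[masters].mean() - item_responses[~masters].mean()

# Illustrative data: one item's 0/1 responses and the examinees' totals.
item   = [1, 1, 0, 1, 0, 1, 1, 0]
totals = [18, 16, 9, 15, 8, 11, 17, 10]
print(criterion_group_discrimination(item, totals, cut=14))  # 1.0 - 0.25 = 0.75
```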
McCowan, Richard J.; McCowan, Sheila C. – Online Submission, 1999
This paper describes major concepts related to item analysis for criterion-referenced tests, including validity, reliability, item difficulty, and item discrimination. The paper discusses how these concepts can be used to revise and improve items and lists suggestions regarding general…
Descriptors: Criterion Referenced Tests, Standard Setting, Item Analysis, Item Response Theory
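The two workhorse statistics involved, item difficulty (proportion correct) and item discrimination, take only a few lines to compute; a sketch using a corrected point-biserial (item versus rest-of-test score) on invented data:

```python
import numpy as np

def item_analysis(responses):
    """responses: examinees x items 0/1 matrix. Returns per-item
    difficulty (p-value) and point-biserial discrimination, correlating
    each item with the rest-of-test score to avoid self-inflation.
    """
    responses = np.asarray(responses, dtype=float)
    p = responses.mean(axis=0)
    rest = responses.sum(axis=1, keepdims=True) - responses
    r_pb = np.array([np.corrcoef(responses[:, j], rest[:, j])[0, 1]
                     for j in range(responses.shape[1])])
    return p, r_pb

# Illustrative 6 examinees x 4 items.
X = [[1,1,1,0],[1,1,0,0],[1,0,1,1],[0,1,0,0],[1,1,1,1],[0,0,0,0]]
p, r = item_analysis(X)
print(p)   # difficulty: easier items have higher p
print(r)   # negative or near-zero values flag items for revision
```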
Roid, Gale H.; And Others – 1980
An earlier study was extended and replicated to examine the feasibility of generating multiple-choice test questions by transforming sentences from prose instructional material. In the first study, a computer-based algorithm was used to analyze prose subject matter and to identify high-information words. Sentences containing selected words were…
Descriptors: Algorithms, Computer Assisted Testing, Criterion Referenced Tests, Difficulty Level
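The heart of such an algorithm is selecting high-information words and deleting them to form completion stems; a toy sketch of that transformation step (the stop-word list, sentence, and helper names are invented for illustration, and distractor generation is omitted):

```python
import re

# Toy stop-word list standing in for a corpus-based frequency table; in
# this line of work, low-frequency content words carry high information.
COMMON = {"the", "a", "an", "of", "to", "in", "is", "are", "and", "by",
          "from", "that", "which", "its", "with", "into"}

def cloze_items(sentence, n_blanks=1):
    """Turn a prose sentence into completion-style stems by deleting
    its high-information (here: simply non-common) words, one item per
    selected keyword."""
    words = re.findall(r"[A-Za-z]+", sentence)
    keywords = [w for w in words if w.lower() not in COMMON]
    items = []
    for key in keywords[:n_blanks]:
        stem = re.sub(rf"\b{key}\b", "_____", sentence, count=1)
        items.append({"stem": stem, "answer": key})
    return items

text = "Chlorophyll absorbs light in the red and blue regions of the spectrum."
for item in cloze_items(text, n_blanks=2):
    print(item["stem"], "->", item["answer"])
```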