Elliott, Stuart; Chudowsky, Naomi; Plake, Barbara S.; McDonnell, Lorraine – Educational Measurement: Issues and Practice, 2006
In 2001, the U.S. Citizenship and Immigration Services (USCIS) began the process of redesigning the U.S. naturalization tests due to concerns that the current testing procedure may not be sufficiently uniform, and that the test content may not be appropriate. A National Research Council committee issued recommendations, based largely on the…
Descriptors: Citizenship, Immigration, Test Construction, Testing
Wallach, P. M.; Crespo, L. M.; Holtzman, K. Z.; Galbraith, R. M.; Swanson, D. B. – Advances in Health Sciences Education, 2006
Purpose: In conjunction with curricular changes, a process to develop integrated examinations was implemented. Pre-established guidelines were provided favoring vignettes, clinically relevant material, and application of knowledge rather than simple recall. Questions were read aloud in a committee including all course directors, and a reviewer…
Descriptors: Test Items, Rating Scales, Examiners, Guidelines
Content Characteristics of GRE Analytical Reasoning Items. GRE Board Professional Report No. 84-14P.
Chalifour, Clark; Powers, Donald E. – 1988
In actual test development practice, the number of test items that must be developed and pretested is typically greater, and sometimes much greater, than the number eventually judged suitable for use in operational test forms. This has proven to be especially true for analytical reasoning items, which currently form the bulk of the analytical…
Descriptors: Coding, Difficulty Level, Higher Education, Test Construction
Malcolm, Donald J. – 1992
Various memoranda concerning language test development procedures and technical operations are compiled for staff at the Kuwait University Language Center from the Office of Tests and Measurement. The memoranda are of interest to Unit Test Representatives but also are intended to provide guidance to unit supervisors, course coordinators, and…
Descriptors: Foreign Countries, Higher Education, Language Tests, Standards

Feldt, Leonard S. – Applied Measurement in Education, 2002
Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…
Descriptors: Error of Measurement, Reliability, Scores, Test Construction

LeMahieu, Paul G. – NASSP Bulletin, 1992
The value of assessment activities should be judged by their contribution to what happens, directly or indirectly, between teachers and students. No one assessment can serve all ends. Alternative forms of assessment measure outcomes beyond the purview of traditional measures and are more authentic efforts to represent behavior or accomplishments…
Descriptors: Elementary Secondary Education, Evaluation Criteria, Student Evaluation, Test Construction
Alderson, J. Charles; Figueras, Neus; Kuijper, Henk; Nold, Guenter; Takala, Sauli; Tardieu, Claire – Language Assessment Quarterly, 2006
The Common European Framework of Reference (CEFR) is intended as a reference document for language education including assessment. This article describes a project that investigated whether the CEFR can help test developers construct reading and listening tests based on CEFR levels. If the CEFR scales together with the detailed description of…
Descriptors: Test Content, Listening Comprehension Tests, Classification, Test Construction
Childs, Ruth A.; Jaciw, Andrew P. – 2003
This Digest describes matrix sampling of test items as an approach to achieving broad coverage while minimizing testing time per student. Matrix sampling involves developing a complete set of items judged to cover the curriculum, then dividing the items into subsets and administering one subset to each student. Matrix sampling, by limiting the…
Descriptors: Item Banks, Matrices, Sampling, Test Construction
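The matrix sampling procedure described in the Childs and Jaciw abstract above (build a full item set, divide it into subsets, administer one subset per student) can be sketched roughly as follows. This is an illustrative sketch only, not the Digest's own procedure: the function name `matrix_sample`, the random round-robin split, and the rotation of forms across students are all assumptions made for the example.

```python
import random

def matrix_sample(item_ids, n_forms, student_ids, seed=0):
    """Split the full item set into n_forms disjoint subsets ("forms")
    and assign exactly one form to each student.

    Hypothetical helper for illustration; real designs may balance
    forms by content area or difficulty rather than at random.
    """
    rng = random.Random(seed)
    shuffled = list(item_ids)
    rng.shuffle(shuffled)
    # Deal items round-robin so form sizes differ by at most one item.
    forms = [shuffled[k::n_forms] for k in range(n_forms)]
    # Rotate forms across students so each form is used about equally often.
    assignment = {s: i % n_forms for i, s in enumerate(student_ids)}
    return forms, assignment

# Example: 12 items, 3 forms, 6 students (all values are made up).
students = ["s1", "s2", "s3", "s4", "s5", "s6"]
forms, assignment = matrix_sample(range(1, 13), n_forms=3, student_ids=students)
for student in students:
    print(student, "takes form", assignment[student], "items", forms[assignment[student]])
```

Under these assumptions each student answers 4 of the 12 items, while the group as a whole still covers every item.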
Georgia State Dept. of Education, Atlanta. – 1999
This document contains a description of the Georgia High School Graduation Test in mathematics. The test item specifications, reflecting the Georgia State Quality Core Curriculum, are used by writers and reviewers who are responsible for the development of test items. Much of the content in the description is based on earlier test versions…
Descriptors: High School Students, High Schools, Mathematics, Standardized Tests
Leung, Chi-Keung; Chang, Hua-Hua; Hau, Kit-Tai – 2000
Item selection methods in computerized adaptive testing (CAT) can yield an extremely skewed item exposure distribution in which items with high "a" values may be over-exposed while those with low "a" values may never be selected. H. Chang and Z. Ying (1999) proposed the a-stratified design (ASTR) that attempts to equalize item…
Descriptors: Adaptive Testing, Computer Assisted Testing, Selection, Test Construction

Leung, Chi-Keung; Chang, Hua-Hua; Hau, Kit-Tai – Educational and Psychological Measurement, 2003
Studied three stratification designs for computerized adaptive testing in conjunction with three well-developed content balancing methods. Simulation study results show substantial differences in item overlap rate and pool utilization among different methods. Recommends an optimal combination of stratification design and content balancing method.…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Simulation
ACT, Inc., 2007
End-of-course examinations are only as good as the assumptions used in designing them. What is a course's "essential" content? And what does it mean to master it? The end-of-course examinations developed by ACT are derived from assumptions that offer unique and challenging answers to these questions. This brief explains the process used to develop…
Descriptors: Test Content, Course Objectives, Test Construction, Readiness

Douglas, Dan – Applied Language Learning, 1989
Discusses the issues related to the testing of listening comprehension in the context of the 1986 American Council on the Teaching of Foreign Languages (ACTFL) Proficiency Guidelines. Four issues are singled out: the meaning of context in testing listening; the concept of criterion-referenced tests; the notion of specific purposes in testing; and the use of…
Descriptors: Listening Comprehension, Listening Comprehension Tests, Listening Skills, Test Construction
Council of Chief State School Officers, Washington, DC. – 1992
The Reading Framework for the 1992 National Assessment of Educational Progress (NAEP) contains the rationale for the aspects of reading assessed in 1992 and criteria for development of the assessment. Developed through a national consensus process as a part of an effort to move assessment forward, the framework presented in the booklet is more…
Descriptors: Elementary Secondary Education, Literacy, Reading Skills, Reading Tests
Leung, Chi-Keung; Chang, Hua-Hua; Hau, Kit-Tai – 2001
The multistage alpha-stratified computerized adaptive testing (CAT) design advocated a new philosophy of pool management and item selection using low-discriminating items first. It has been demonstrated through simulation studies to be effective both in reducing item overlap rate and enhancing pool utilization with certain pool types. Based on…
Descriptors: Adaptive Testing, Computer Assisted Testing, Item Banks, Selection
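The idea of using low-discriminating items first, mentioned in the abstract above, can be sketched as follows. This is a simplified illustration, not the authors' implementation: the pool is sorted by discrimination ("a") and cut into strata, and at each stage the unused item whose difficulty ("b") is closest to the current ability estimate is picked from that stage's stratum. The item pool, the helper names `build_strata` and `select_item`, and the fixed sequence of ability estimates are all hypothetical; a real CAT would re-estimate ability after every response.

```python
def build_strata(pool, n_strata):
    """Sort items by ascending discrimination and cut into n_strata groups."""
    ordered = sorted(pool, key=lambda item: item["a"])
    size, extra = divmod(len(ordered), n_strata)
    strata, start = [], 0
    for k in range(n_strata):
        end = start + size + (1 if k < extra else 0)
        strata.append(ordered[start:end])
        start = end
    return strata

def select_item(stratum, theta, used):
    """Pick the unused item in the stratum whose difficulty is nearest theta."""
    candidates = [item for item in stratum if item["id"] not in used]
    return min(candidates, key=lambda item: abs(item["b"] - theta))

# Hypothetical six-item pool: a = discrimination, b = difficulty.
pool = [
    {"id": 1, "a": 0.5, "b": -1.0},
    {"id": 2, "a": 0.6, "b": 0.5},
    {"id": 3, "a": 1.0, "b": 0.0},
    {"id": 4, "a": 1.1, "b": 1.0},
    {"id": 5, "a": 1.8, "b": -0.5},
    {"id": 6, "a": 2.0, "b": 0.8},
]
strata = build_strata(pool, n_strata=3)

# Assumed ability estimates before each stage (stand-ins for real updates).
theta_by_stage = [0.0, 0.6, 0.9]

used, administered = set(), []
for stage, theta in enumerate(theta_by_stage):
    item = select_item(strata[stage], theta, used)
    used.add(item["id"])
    administered.append(item["id"])

print(administered)
```

Because early stages draw only from the low-"a" strata, high-"a" items are held back until the ability estimate is more stable, which is the mechanism the abstract credits with reducing item overlap and improving pool utilization.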