Anne Traynor; Sara C. Christopherson – Applied Measurement in Education, 2024
Combining methods from earlier content validity and more contemporary content alignment studies may allow a more complete evaluation of the meaning of test scores than if either set of methods is used on its own. This article distinguishes item relevance indices in the content validity literature from test representativeness indices in the…
Descriptors: Test Validity, Test Items, Achievement Tests, Test Construction
Wise, Steven L. – Applied Measurement in Education, 2020
In achievement testing there is typically a practical requirement that the set of items administered should be representative of some target content domain. This is accomplished by establishing test blueprints specifying the content constraints to be followed when selecting the items for a test. Sometimes, however, students give disengaged…
Descriptors: Test Items, Test Content, Achievement Tests, Guessing (Tests)
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Keller, Lisa A.; Keller, Robert R. – Applied Measurement in Education, 2015
Equating test forms is an essential activity in standardized testing, with increased importance with the accountability systems in existence through the mandate of Adequate Yearly Progress. It is through equating that scores from different test forms become comparable, which allows for the tracking of changes in the performance of students from…
Descriptors: Item Response Theory, Rating Scales, Standardized Tests, Scoring Rubrics
Ewing, Maureen; Packman, Sheryl; Hamen, Cynthia; Thurber, Allison Clark – Applied Measurement in Education, 2010
In the last few years, the Advanced Placement (AP) Program® has used evidence-centered assessment design (ECD) to articulate the knowledge, skills, and abilities to be taught in the course and measured on the summative exam for four science courses, three history courses, and six world language courses; its application to calculus and English…
Descriptors: Advanced Placement Programs, Equivalency Tests, Evidence, Test Construction
Meyers, Jason L.; Miller, G. Edward; Way, Walter D. – Applied Measurement in Education, 2009
In operational testing programs using item response theory (IRT), item parameter invariance is threatened when an item appears in a different location on the live test than it did when it was field tested. This study utilizes data from a large state's assessments to model change in Rasch item difficulty (RID) as a function of item position change,…
Descriptors: Test Items, Test Content, Testing Programs, Simulation

Smisko, Ann; Twing, Jon S.; Denny, Patricia – Applied Measurement in Education, 2000
Describes the Texas test development process in detail, showing how each test development step is linked to the "Standards for Educational and Psychological Testing." The routine use of this process provides evidence of the content and curricular validity of the Texas Assessment of Academic Skills. (SLD)
Descriptors: Achievement Tests, Curriculum, Models, Test Construction

Ferrara, Steven; And Others – Applied Measurement in Education, 1997
Causes of local item dependence in a large-scale performance assessment were studied using data from the Maryland School Performance Assessment Program. Contextual characteristics (content and response requirements) were identified to differentiate locally independent and dependent item clusters. Hypothesized explanations are offered for high…
Descriptors: Context Effect, Performance Based Assessment, Responses, Test Content

Feldt, Leonard S. – Applied Measurement in Education, 2002
Considers the situation in which content or administrative considerations limit the way in which a test can be partitioned to estimate the internal consistency reliability of the total test score. Demonstrates that a single-valued estimate of the total score reliability is possible only if an assumption is made about the comparative size of the…
Descriptors: Error of Measurement, Reliability, Scores, Test Construction

Raymond, Mark R. – Applied Measurement in Education, 2001
Reviews general approaches to job analysis and considers methodological issues related to sampling and the development of rating scales used to measure and describe a profession or occupation. Evaluates the usefulness of different types of test plans and describes judgmental and empirical methods for using practice analysis data to help develop…
Descriptors: Certification, Job Analysis, Licensing Examinations (Professions), Rating Scales

Kingsbury, G. Gage; Zara, Anthony R. – Applied Measurement in Education, 1991
This simulation investigated two procedures that reduce differences between paper-and-pencil testing and computerized adaptive testing (CAT) by making CAT content sensitive. Results indicate that the price in terms of additional test items of using constrained CAT for content balancing is much smaller than that of using testlets. (SLD)
Descriptors: Adaptive Testing, Comparative Analysis, Computer Assisted Testing, Computer Simulation
Huynh, Huynh; Barton, Karen E. – Applied Measurement in Education, 2006
This study examined the effect of oral administration accommodations on test structure and student performance on the Reading test of the South Carolina High School Exit Examination (HSEE). The examination was given at Grade 10 and was untimed; hence, students were permitted as much time as they needed to answer all the questions. Three groups of…
Descriptors: Reading Tests, Exit Examinations, Learning Disabilities, Academic Achievement

Mehrens, William A. – Applied Measurement in Education, 1997
This commentary on articles in this special issue generally agrees with the viewpoints expressed, although it argues that in some cases the authors of these articles should have expanded on certain issues. Many comments relate to the legal defensibility of the positions taken. (SLD)
Descriptors: Certification, Decision Making, Licensing Examinations (Professions), Performance Based Assessment

Stecher, Brian M.; Klein, Stephen P.; Solano-Flores, Guillermo; McCaffrey, Dan; Robyn, Abby; Shavelson, Richard J.; Haertel, Edward – Applied Measurement in Education, 2000
Studied content domain, format, and level of inquiry as factors contributing to the large variation in student performance across open-ended measures. Results for more than 1,200 eighth graders do not support the hypothesis that tasks similar in content, format, and level of inquiry would correlate higher with each other than with measures…
Descriptors: Correlation, Inquiry, Junior High School Students, Junior High Schools

Crocker, Linda – Applied Measurement in Education, 1997
The experience of the National Board for Professional Teaching Standards illustrates how issues of assessing the content representativeness of performance assessment can be addressed to ensure validity for certification procedures. Explores the challenges of collecting validation evidence when expert judgments of content are used. (SLD)
Descriptors: Content Validity, Credentials, Data Collection, Evaluation Methods