Karen Blackburn Hoeve – ProQuest LLC, 2021
High stakes test-based accountability systems primarily rely on aggregates and derivatives of scores from tests that were originally developed to measure individual student mastery of content specifications. Current validity models do not explicitly address this use of aggregate scores to measure the performance of teachers, administrators, and…
Descriptors: Accountability, Test Validity, High Stakes Tests, Hierarchical Linear Modeling
An, Lily Shiao; Ho, Andrew Dean; Davis, Laurie Laughlin – Educational Measurement: Issues and Practice, 2022
Technical documentation for educational tests focuses primarily on properties of individual scores at single points in time. Reliability, standard errors of measurement, item parameter estimates, fit statistics, and linking constants are standard technical features that external stakeholders use to evaluate items and individual scale scores.…
Descriptors: Documentation, Scores, Evaluation Methods, Longitudinal Studies
Emre Zengin; Yasemin Karal – International Journal of Assessment Tools in Education, 2024
This study was carried out to develop a test to assess algorithmic thinking skills. To this end, the twelve steps suggested by Downing (2006) were adopted. Throughout the test development, 24 middle school sixth-grade students and eight experts in different areas took part as needed in the tasks on the project. The test was given to 252 students…
Descriptors: Grade 6, Algorithms, Thinking Skills, Evaluation Methods
Mattern, Krista; Radunzel, Justine – ACT, Inc., 2019
When applicants take the ACT® more than once, how do colleges and universities reconcile and make sense of the multiple scores? In terms of validity, fairness, and impact on subgroup differences, are certain score-use policies better than others? The focus of this issue brief is to summarize evidence on the validity and fairness of various…
Descriptors: Scoring, College Entrance Examinations, Test Validity, Evaluation Methods
Baraldi Cunha, Andrea; Babik, Iryna; Koziol, Natalie A.; Hsu, Lin-Ya; Nord, Jayden; Harbourne, Regina T.; Westcott-McCoy, Sarah; Dusing, Stacey C.; Bovaird, James A.; Lobo, Michele A. – Grantee Submission, 2021
Purpose: To evaluate the validity, reliability, and sensitivity of the novel Means-End Problem-Solving Assessment Tool (MEPSAT). Methods: Children with typical development and those with motor delay were assessed throughout the first 2 years of life using the MEPSAT. MEPSAT scores were validated against the cognitive and motor subscales of the…
Descriptors: Problem Solving, Early Intervention, Evaluation Methods, Motor Development
Lynch, Sarah – Practical Assessment, Research & Evaluation, 2022
In today's digital age, tests are increasingly being delivered on computers. Many of these computer-based tests (CBTs) have been adapted from paper-based tests (PBTs). However, this change in mode of test administration has the potential to introduce construct-irrelevant variance, affecting the validity of score interpretations. Because of this,…
Descriptors: Computer Assisted Testing, Tests, Scores, Scoring
Interaction, Change, and the Role of the Historical in Validation: The Case of L2 Dynamic Assessment
Poehner, Matthew E.; van Compernolle, Rémi A. – Journal of Cognitive Education and Psychology, 2018
This article examines the implications of argument-based validity for the continued development of dynamic assessment (DA) research and practice. We propose that the move toward validation as a process of interpretation and evidence-based argument is commensurable with DA but that fundamental ontological differences with conventional approaches to…
Descriptors: Alternative Assessment, Evaluation Methods, Second Language Learning, Interaction
Reardon, Sean F.; Ho, Andrew D.; Kalogrides, Demetra – Stanford Center for Education Policy Analysis, 2019
Linking score scales across different tests is considered speculative and fraught, even at the aggregate level (Feuer et al., 1999; Thissen, 2007). We introduce and illustrate validation methods for aggregate linkages, using the challenge of linking U.S. school district average test scores across states as a motivating example. We show that…
Descriptors: Test Validity, Evaluation Methods, School Districts, Scores
Morphew, Jason W.; Mestre, Jose P.; Kang, Hyeon-Ah; Chang, Hua-Hua; Fabry, Gregory – Physical Review Physics Education Research, 2018
Prior research has established that students often underprepare for midterm examinations yet remain overconfident in their proficiency. Research concerning the testing effect has demonstrated that utilizing testing as a study strategy leads to higher performance and more accurate confidence compared to more common study strategies such as…
Descriptors: Computer Assisted Testing, Physics, Science Instruction, Introductory Courses
Wakabayashi, Tomoko; Claxton, Jill; Smith, Everett V., Jr. – Journal of Psychoeducational Assessment, 2019
The Child Observation Record (COR), initially developed in 1993 by HighScope Educational Research Foundation, is an observation-based instrument that provides systematic assessment of young children's knowledge and abilities in all major areas of development. Teachers or caregivers spend a few minutes each day writing brief notes or…
Descriptors: Observation, Evaluation Methods, Early Childhood Education, Kindergarten
Frazier, Thomas W.; Hauschild, Kathryn M.; Klingemier, Eric; Strauss, Mark S.; Hardan, Antonio Y.; Youngstrom, Eric A. – Journal of Intellectual & Developmental Disability, 2020
Background: Language assessment is a key element of evaluations of children and adolescents with neurodevelopmental disorders (NDDs). The present study examined the validity of a gaze-based receptive language index (RLI) in predicting language test results. Method: Participants included toddlers, pre-school, and school age children and adolescents…
Descriptors: Children, Adolescents, Neurological Impairments, Evaluation Methods
Xiao, Yang; Fritchman, Joseph C.; Bao, Jacqueline Y.; Nie, Ying; Han, Jing; Xiong, Jianwen; Xiao, Hua; Bao, Lei – Physical Review Physics Education Research, 2019
In physics education research (PER), concept inventories (CIs) have become standard instruments for assessing students' learning throughout instruction. To promote widespread use of concept inventories, previous studies have developed an approach to split a full length CI into short versions of CIs. This research extends the existing method to…
Descriptors: Physics, Science Instruction, Energy, Magnets
Dumas, Denis G.; McNeish, Daniel M. – Educational Researcher, 2018
Dynamic measurement modeling (DMM) has been shown to improve the consequential validity of longitudinal mathematics assessment in the Early Childhood Longitudinal Study-Kindergarten (ECLS-K) database. Here, the authors demonstrate the capability of DMM to similarly improve the consequential validity of ECLS-K reading assessment through the…
Descriptors: Measurement Techniques, Student Evaluation, Alternative Assessment, Evaluation Methods
Betts, Julian; Hill, Laura; Bachofer, Karen; Hayes, Joseph; Lee, Andrew; Zau, Andrew – Public Policy Institute of California, 2019
This document includes two technical appendices that accompany the main report, "English Learner Trajectories and Reclassification." The two appendices include: (1) Methodology; and (2) Supporting Tables and Figures. [For the full report, see ED603764.]
Descriptors: English Language Learners, Classification, School Districts, Outcomes of Education
Badger, Julia R.; Mellanby, Jane – British Journal of Educational Psychology, 2018
Background: School attainment tests and Cognitive Abilities Tests are used in the United Kingdom to set targets for educational outcome. Whilst these are good predictors, they depend not only on basic ability but also on learnt knowledge and skills, such as reading. Method and Aims: VESPARCH is an online group test of verbal and spatial reasoning,…
Descriptors: Foreign Countries, Intelligence Tests, Verbal Ability, Spatial Ability