Donoghue, John R.; Eckerly, Carol – Applied Measurement in Education, 2024
Trend scoring constructed response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Raykov, Tenko – Educational and Psychological Measurement, 2024
This note is concerned with the benefits that can result from the use of the maximal reliability and optimal linear combination concepts in educational and psychological research. Within the widely used framework of unidimensional multi-component measuring instruments, it is demonstrated that the linear combination of their components that…
Descriptors: Educational Research, Behavioral Science Research, Reliability, Error of Measurement
Lee, Yi-Hsuan; Haberman, Shelby J. – Journal of Educational Measurement, 2021
For assessments that use different forms in different administrations, equating methods are applied to ensure comparability of scores over time. Ideally, a score scale is well maintained throughout the life of a testing program. In reality, instability of a score scale can result from a variety of causes, some are expected while others may be…
Descriptors: Scores, Regression (Statistics), Demography, Data
Holm, Anders; Hjorth-Trolle, Anders; Andersen, Robert – Sociological Methods & Research, 2025
Lagged dependent variables (LDVs) are often used as predictors in ordinary least squares (OLS) models in the social sciences. Although several estimators are commonly employed, little is known about their relative merits in the presence of classical measurement error and different longitudinal processes. We assess the performance of four commonly…
Descriptors: Elementary Education, Scores, Error of Measurement, Predictor Variables
Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021
Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…
Descriptors: Rating Scales, Test Construction, Language Tests, Test Use
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2019
This note discusses the merits of coefficient alpha and their conditions in light of recent critical publications that miss out on significant research findings over the past several decades. That earlier research has demonstrated the empirical relevance and utility of coefficient alpha under certain empirical circumstances. The article highlights…
Descriptors: Test Validity, Test Reliability, Test Items, Correlation
Harmston, Matt T.; Camara, Wayne J.; Phillips, Christine K. – ACT, Inc., 2019
Average score change: How big is big? This paper discusses school-level changes in average ACT scores and highlights an interactive tool designed to facilitate score change comparisons.
Descriptors: College Entrance Examinations, High School Students, Scores, Reliability
Alqarni, Abdulelah Mohammed – Journal on Educational Psychology, 2019
This study compares the psychometric properties of reliability in Classical Test Theory (CTT), item information in Item Response Theory (IRT), and validation from the perspective of modern validity theory for the purpose of bringing attention to potential issues that might exist when testing organizations use both test theories in the same testing…
Descriptors: Test Theory, Item Response Theory, Test Construction, Scoring
McGee, Monnie – Journal of Statistics Education, 2019
In several sporting events, the winner is chosen on the basis of a subjective score. These sports include gymnastics, ice skating, and diving. Unlike for other subjectively judged sports, diving competitions consist of multiple rounds in quick succession on the same apparatus. These multiple rounds lead to an extra layer of complexity in the data,…
Descriptors: Data Use, Visualization, Interrater Reliability, Introductory Courses
Christensen, Rhonda; Knezek, Gerald – Journal of Technology Education, 2022
This article describes the development and validation of an Innovation Attitude Survey (IAS) composed of 16 Likert-type items selected to measure middle school students' attitudes toward innovation and leadership in the advancement of new ideas. The goal of developing the IAS was to identify desirable dispositions that may be related to future…
Descriptors: Attitude Measures, Likert Scales, Test Construction, Test Validity
Norris, John; Drackert, Anastasia – Language Testing, 2018
The Test of German as a Foreign Language (TestDaF) plays a critical role as a standardized test of German language proficiency. Developed and administered by the Society for Academic Study Preparation and Test Development (g.a.s.t.), TestDaF was launched in 2001 and has experienced persistent annual growth, with more than 44,000 test takers in…
Descriptors: German, Second Language Learning, Language Tests, Language Proficiency
Aloisi, Cesare; Callaghan, A. – Higher Education Pedagogies, 2018
The University of Reading Learning Gain project is a three-year longitudinal project to test and evaluate a range of available methodologies and to draw conclusions on what might be the right combination of instruments for the measurement of Learning Gain in higher education. This paper analyses the validity of a measure of critical thinking…
Descriptors: Foreign Countries, Cognitive Tests, Critical Thinking, Thinking Skills
Dumas, Denis G.; McNeish, Daniel M. – Educational Researcher, 2018
Dynamic measurement modeling (DMM) has been shown to improve the consequential validity of longitudinal mathematics assessment in the Early Childhood Longitudinal Study-Kindergarten (ECLS-K) database. Here, the authors demonstrate the capability of DMM to similarly improve the consequential validity of ECLS-K reading assessment through the…
Descriptors: Measurement Techniques, Student Evaluation, Alternative Assessment, Evaluation Methods
Green, Samuel B.; Yang, Yanyun – Educational Measurement: Issues and Practice, 2015
In the lead article, Davenport, Davison, Liou, & Love demonstrate the relationship among homogeneity, internal consistency, and coefficient alpha, and also distinguish among them. These distinctions are important because too often coefficient alpha--a reliability coefficient--is interpreted as an index of homogeneity or internal consistency.…
Descriptors: Reliability, Factor Analysis, Computation, Factor Structure
Sultana, Nasreen – Language Testing in Asia, 2018
This paper reviews the most important public English examination (matriculation exam) that students take at the end of their secondary education in Bangladesh. The examination is known as the Secondary School Certificate (SSC), which is taken at the end of Grade 10 in the mainstream education in the country. The score of SSC English examination is…
Descriptors: English (Second Language), Language Tests, Secondary School Students, Scores