ERIC - Search Results

Showing 1 to 15 of 190 results

New Tests of Rater Drift in Trend Scoring

Peer reviewed

John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024

Trend scoring constructed response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…

Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics

On the Benefits of Using Maximal Reliability in Educational and Behavioral Research

Peer reviewed

Tenko Raykov – Educational and Psychological Measurement, 2024

This note is concerned with the benefits that can result from the use of the maximal reliability and optimal linear combination concepts in educational and psychological research. Within the widely used framework of unidimensional multi-component measuring instruments, it is demonstrated that the linear combination of their components that…

Descriptors: Educational Research, Behavioral Science Research, Reliability, Error of Measurement

How to Obtain the Most Error-Free Estimate of Reliability? Eight Sources of Deflation in the Estimates of Reliability to Avoid

Peer reviewed

Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022

The reliability of a test score is usually underestimated and the deflation may be profound, 0.40 - 0.60 units of reliability or 46 - 71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…

Descriptors: Test Reliability, Scores, Test Items, Correlation

How Did Spain Perform in PISA 2018? New Estimates of Children's PISA Reading Scores

Peer reviewed

John Jerrim; Luis Alejandro Lopez-Agudo; Oscar David Marcenaro-Gutierrez – British Journal of Educational Studies, 2024

International large-scale assessments have gained much attention since the beginning of the twenty-first century, influencing education legislation in many countries. This includes Spain, where they have been used by successive governments to justify education policy change. Unfortunately, there was a problem with the PISA 2018 reading scores for…

Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students

Integrating Bifactor Models into a Generalizability Theory Based Structural Equation Modeling Framework

Peer reviewed

Vispoel, Walter P.; Lee, Hyeryung; Xu, Guanlan; Hong, Hyeri – Journal of Experimental Education, 2023

Although generalizability theory (GT) designs have traditionally been analyzed within an ANOVA framework, identical results can be obtained with structural equation models (SEMs) but extended to represent multiple sources of both systematic and measurement error variance, include estimation methods less likely to produce negative variance…

Descriptors: Generalizability Theory, Structural Equation Models, Programming Languages, Scores

Lagged Dependent Variable Predictors, Classical Measurement Error, and Path Dependency: The Conditions under Which Various Estimators Are Appropriate

Peer reviewed

Anders Holm; Anders Hjorth-Trolle; Robert Andersen – Sociological Methods & Research, 2025

Lagged dependent variables (LDVs) are often used as predictors in ordinary least squares (OLS) models in the social sciences. Although several estimators are commonly employed, little is known about their relative merits in the presence of classical measurement error and different longitudinal processes. We assess the performance of four commonly…

Descriptors: Elementary Education, Scores, Error of Measurement, Predictor Variables

Linear and Nonlinear Indices of Score Accuracy and Item Effectiveness for Measures That Contain Locally Dependent Items

Peer reviewed

Pere J. Ferrando; David Navarro-González; Fabia Morales-Vives – Educational and Psychological Measurement, 2025

The problem of local item dependencies (LIDs) is very common in personality and attitude measures, particularly in those that measure narrow-bandwidth dimensions. At the structural level, these dependencies can be modeled by using extended factor analytic (FA) solutions that include correlated residuals. However, the effects that LIDs have on the…

Descriptors: Scores, Accuracy, Evaluation Methods, Factor Analysis

The Exchangeability of Brief Intelligence Tests for Children with Intellectual Giftedness: Illuminating Error Variance Components' Influence on IQs

Peer reviewed

Irby, Sarah M.; Floyd, Randy G. – Psychology in the Schools, 2017

This study examined the exchangeability of total scores (i.e., intelligence quotients [IQs]) from three brief intelligence tests. Tests were administered to 36 children with intellectual giftedness, scored live by one set of primary examiners and later scored by a secondary examiner. For each student, six IQs were calculated, and all 216 values…

Descriptors: Intelligence Tests, Gifted, Error of Measurement, Scores

The Video Engagement Scale (VES): Measurement Properties of the Full and Shortened VES across Studies

Peer reviewed

Lehmann, Vicky; Hillen, Marij A.; Verdam, Mathilde G. E.; Pieterse, Arwen H.; Labrie, Nanon H. M.; Fruijtier, Agnetha D.; Oreel, Tom H.; Smets, Ellen M. A.; Visser, Leonie N. C. – International Journal of Social Research Methodology, 2023

The Video Engagement Scale (VES) is a quality indicator to assess engagement in experimental video-vignette studies, but its measurement properties warrant improvement. Data from previous studies were combined (N = 2676) and split into three subsamples for a stepped analytical approach. We tested construct validity, criterion validity,…

Descriptors: Likert Scales, Video Technology, Vignettes, Construct Validity

Measuring Language Ability of Students with Compensatory Multidimensional CAT: A Post-Hoc Simulation Study

Peer reviewed

Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022

The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…

Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency

Conditional Precision of Measurement for Test Scores: Are Conditional Standard Errors Sufficient?

Peer reviewed

Nicewander, W. Alan – Educational and Psychological Measurement, 2019

This inquiry is focused on three indicators of the precision of measurement, conditional on fixed values of θ, the latent variable of item response theory (IRT). The indicators that are compared are (1) the traditional conditional standard errors, σ(eX|θ) = CSEM; (2) the IRT-based conditional standard errors, σ[subscript irt](eX|θ) = C[subscript…

Descriptors: Measurement, Accuracy, Scores, Error of Measurement

Thematic Content Analysis of Studies Using Generalizability Theory

Peer reviewed

Teker, Gülsen Tasdelen; Güler, Nese – International Journal of Assessment Tools in Education, 2019

One of the important theories in education and psychology is Generalizability (G) Theory and various properties distinguish it from the other measurement theories. To better understand methodological trends of G theory, a thematic content analysis was conducted. This study analyzes the studies using generalizability theory in the field of…

Descriptors: Generalizability Theory, Content Analysis, Foreign Countries, Education

Processes and Procedures for Estimating Score Reliability and Precision

Peer reviewed

Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017

Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…

Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests

Working with Sparse Data in Rated Language Tests: Generalizability Theory Applications

Peer reviewed

Lin, Chih-Kai – Language Testing, 2017

Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…

Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy

Test of Measurement Invariance, and Evidence for Reliability and Validity of AMAS Scores in Dutch Secondary School and University Students

Peer reviewed

Schmitz, Eva A.; Salemink, Elske; Wiers, Reinout W.; Jansen, Brenda R. J. – Journal of Psychoeducational Assessment, 2022

The Abbreviated Math Anxiety Scale (AMAS) is commonly used to compare groups on math anxiety. Group comparisons should however be preceded by a demonstration of metric and scalar measurement invariance, which is currently only available for undergraduate students in the USA. This study tested for metric and scalar measurement invariance of AMAS…

Descriptors: Foreign Countries, Secondary School Students, College Students, Mathematics Anxiety
