Showing 1 to 15 of 190 results
Peer reviewed
John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring constructed-response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Peer reviewed
Tenko Raykov – Educational and Psychological Measurement, 2024
This note is concerned with the benefits that can result from the use of the maximal reliability and optimal linear combination concepts in educational and psychological research. Within the widely used framework of unidimensional multi-component measuring instruments, it is demonstrated that the linear combination of their components that…
Descriptors: Educational Research, Behavioral Science Research, Reliability, Error of Measurement
Peer reviewed
Metsämuuronen, Jari – Practical Assessment, Research & Evaluation, 2022
The reliability of a test score is usually underestimated and the deflation may be profound, 0.40–0.60 units of reliability or 46–71%. Eight root sources of the deflation are discussed and quantified by a simulation with 1,440 real-world datasets: (1) errors in the measurement modelling, (2) inefficiency in the estimator of reliability within…
Descriptors: Test Reliability, Scores, Test Items, Correlation
Peer reviewed
John Jerrim; Luis Alejandro Lopez-Agudo; Oscar David Marcenaro-Gutierrez – British Journal of Educational Studies, 2024
International large-scale assessments have gained much attention since the beginning of the twenty-first century, influencing education legislation in many countries. This includes Spain, where they have been used by successive governments to justify education policy change. Unfortunately, there was a problem with the PISA 2018 reading scores for…
Descriptors: Foreign Countries, Achievement Tests, International Assessment, Secondary School Students
Peer reviewed
Vispoel, Walter P.; Lee, Hyeryung; Xu, Guanlan; Hong, Hyeri – Journal of Experimental Education, 2023
Although generalizability theory (GT) designs have traditionally been analyzed within an ANOVA framework, identical results can be obtained with structural equation models (SEMs) but extended to represent multiple sources of both systematic and measurement error variance, include estimation methods less likely to produce negative variance…
Descriptors: Generalizability Theory, Structural Equation Models, Programming Languages, Scores
Peer reviewed
Anders Holm; Anders Hjorth-Trolle; Robert Andersen – Sociological Methods & Research, 2025
Lagged dependent variables (LDVs) are often used as predictors in ordinary least squares (OLS) models in the social sciences. Although several estimators are commonly employed, little is known about their relative merits in the presence of classical measurement error and different longitudinal processes. We assess the performance of four commonly…
Descriptors: Elementary Education, Scores, Error of Measurement, Predictor Variables
Peer reviewed
Pere J. Ferrando; David Navarro-González; Fabia Morales-Vives – Educational and Psychological Measurement, 2025
The problem of local item dependencies (LIDs) is very common in personality and attitude measures, particularly in those that measure narrow-bandwidth dimensions. At the structural level, these dependencies can be modeled by using extended factor analytic (FA) solutions that include correlated residuals. However, the effects that LIDs have on the…
Descriptors: Scores, Accuracy, Evaluation Methods, Factor Analysis
Peer reviewed
Irby, Sarah M.; Floyd, Randy G. – Psychology in the Schools, 2017
This study examined the exchangeability of total scores (i.e., intelligence quotients [IQs]) from three brief intelligence tests. Tests were administered to 36 children with intellectual giftedness, scored live by one set of primary examiners and later scored by a secondary examiner. For each student, six IQs were calculated, and all 216 values…
Descriptors: Intelligence Tests, Gifted, Error of Measurement, Scores
Peer reviewed
Lehmann, Vicky; Hillen, Marij A.; Verdam, Mathilde G. E.; Pieterse, Arwen H.; Labrie, Nanon H. M.; Fruijtier, Agnetha D.; Oreel, Tom H.; Smets, Ellen M. A.; Visser, Leonie N. C. – International Journal of Social Research Methodology, 2023
The Video Engagement Scale (VES) is a quality indicator to assess engagement in experimental video-vignette studies, but its measurement properties warrant improvement. Data from previous studies were combined (N = 2676) and split into three subsamples for a stepped analytical approach. We tested construct validity, criterion validity,…
Descriptors: Likert Scales, Video Technology, Vignettes, Construct Validity
Peer reviewed
Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022
Computerized adaptive tests (CATs) apply an adaptive process in which the items are tailored to individuals' ability scores. Multidimensional CAT (MCAT) designs differ in the item selection, ability estimation, and termination methods they use. This study aims at investigating the performance of the MCAT designs used to…
Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency
Peer reviewed
Nicewander, W. Alan – Educational and Psychological Measurement, 2019
This inquiry is focused on three indicators of the precision of measurement, conditional on fixed values of θ, the latent variable of item response theory (IRT). The indicators that are compared are (1) the traditional conditional standard errors, σ(e_X|θ) = CSEM; (2) the IRT-based conditional standard errors, σ_irt(e_X|θ) = C[subscript…
Descriptors: Measurement, Accuracy, Scores, Error of Measurement
Peer reviewed
Teker, Gülsen Tasdelen; Güler, Nese – International Journal of Assessment Tools in Education, 2019
One of the important theories in education and psychology is Generalizability (G) Theory and various properties distinguish it from the other measurement theories. To better understand methodological trends of G theory, a thematic content analysis was conducted. This study analyzes the studies using generalizability theory in the field of…
Descriptors: Generalizability Theory, Content Analysis, Foreign Countries, Education
Peer reviewed
Bardhoshi, Gerta; Erford, Bradley T. – Measurement and Evaluation in Counseling and Development, 2017
Precision is a key facet of test development, with score reliability determined primarily according to the types of error one wants to approximate and demonstrate. This article identifies and discusses several primary forms of reliability estimation: internal consistency (i.e., split-half, KR-20, α), test-retest, alternate forms, interscorer, and…
Descriptors: Scores, Test Reliability, Accuracy, Pretests Posttests
Peer reviewed
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Peer reviewed
Schmitz, Eva A.; Salemink, Elske; Wiers, Reinout W.; Jansen, Brenda R. J. – Journal of Psychoeducational Assessment, 2022
The Abbreviated Math Anxiety Scale (AMAS) is commonly used to compare groups on math anxiety. Group comparisons should, however, be preceded by a demonstration of metric and scalar measurement invariance, which is currently only available for undergraduate students in the USA. This study tested for metric and scalar measurement invariance of AMAS…
Descriptors: Foreign Countries, Secondary School Students, College Students, Mathematics Anxiety