ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	0
Since 2006 (last 20 years)	19

Descriptor

Reliability	74
Correlation	17
Error of Measurement	17
Estimation (Mathematics)	14
Item Response Theory	12
Test Theory	12
Models	11
Simulation	11
Statistical Analysis	11
Computation	10
Sampling	9
Test Items	9
Equations (Mathematics)	8
Measurement Techniques	8
Monte Carlo Methods	8
Scores	8
Comparative Analysis	7
Mathematical Models	7
Classification	6
Evaluation Methods	6
Hypothesis Testing	6
Maximum Likelihood Statistics	6
Structural Equation Models	6
Achievement Gains	5
Analysis of Variance	5

Source

Applied Psychological…

Publication Type

Journal Articles	64
Reports - Evaluative	31
Reports - Research	16
Reports - Descriptive	12
Book/Product Reviews	3
Guides - Non-Classroom	1
Information Analyses	1

Education Level

Higher Education	3
Postsecondary Education	2
Early Childhood Education	1
Elementary Education	1
Grade 2	1
High Schools	1
Primary Education	1
Secondary Education	1

Location

Belgium	1
Germany	1
Michigan	1
Spain	1
Sweden	1

Assessments and Surveys

SAT (College Admission Test)	2
ACT Assessment	1
California Psychological…	1
Differential Aptitude Test	1
Edwards Personal Preference…	1
Eysenck Personality Inventory	1
Graduate Record Examinations	1
Minnesota Multiphasic…	1
United States Medical…	1
Wechsler Preschool and…	1

Showing 1 to 15 of 74 results

The Reliability and Precision of Total Scores and IRT Estimates as a Function of Polytomous IRT Parameters and Latent Trait Distribution

Peer reviewed

Culpepper, Steven Andrew – Applied Psychological Measurement, 2013

A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…

Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement

A Latent Class Approach to Estimating Test-Score Reliability

Peer reviewed

van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas – Applied Psychological Measurement, 2011

This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…

Descriptors: Simulation, Reliability, Measurement, Psychology

Comparison of Automated Scoring Methods for a Computerized Performance Assessment of Clinical Judgment

Peer reviewed

Harik, Polina; Baldwin, Peter; Clauser, Brian – Applied Psychological Measurement, 2013

Growing reliance on complex constructed response items has generated considerable interest in automated scoring solutions. Many of these solutions are described in the literature; however, relatively few studies have been published that "compare" automated scoring strategies. Here, comparisons are made among five strategies for…

Descriptors: Computer Assisted Testing, Automation, Scoring, Comparative Analysis

Taking the Error Term of the Factor Model into Account: The Factor Score Predictor Interval

Peer reviewed

Beauducel, Andre – Applied Psychological Measurement, 2013

The problem of factor score indeterminacy implies that the factor and the error scores cannot be completely disentangled in the factor model. It is therefore proposed to compute Harman's factor score predictor that contains an additive combination of factor and error variance. This additive combination is discussed in the framework of classical…

Descriptors: Factor Analysis, Predictor Variables, Reliability, Error of Measurement

Evaluating EIV, OLS, and SEM Estimators of Group Slope Differences in the Presence of Measurement Error: The Single-Indicator Case

Peer reviewed

Culpepper, Steven Andrew – Applied Psychological Measurement, 2012

Measurement error significantly biases interaction effects and distorts researchers' inferences regarding interactive hypotheses. This article focuses on the single-indicator case and shows how to accurately estimate group slope differences by disattenuating interaction effects with errors-in-variables (EIV) regression. New analytic findings were…

Descriptors: Evidence, Test Length, Interaction, Regression (Statistics)

An Evaluation of Item Response Theory Classification Accuracy and Consistency Indices

Peer reviewed

Wyse, Adam E.; Hao, Shiqi – Applied Psychological Measurement, 2012

This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…

Descriptors: Item Response Theory, Classification, Accuracy, Reliability

The Individual Consistency of Acquiescence and Extreme Response Style in Self-Report Questionnaires

Peer reviewed

Weijters, Bert; Geuens, Maggie; Schillewaert, Niels – Applied Psychological Measurement, 2010

The severity of bias in respondents' self-reports due to acquiescence response style (ARS) and extreme response style (ERS) depends strongly on how consistent these response styles are over the course of a questionnaire. In the literature, different alternative hypotheses on response style (in)consistency circulate. Therefore, nine alternative…

Descriptors: Models, Response Style (Tests), Questionnaires, Measurement Techniques

Modeling Individual Differences in Numerical Reasoning Speed as a Random Effect of Response Time Limits

Peer reviewed

Semmes, Robert; Davison, Mark L.; Close, Catherine – Applied Psychological Measurement, 2011

If numerical reasoning items are administered under time limits, will two dimensions be required to account for the responses, a numerical ability dimension and a speed dimension? A total of 182 college students answered 74 numerical reasoning items. Every item was taken with and without time limits by half the students. Three psychometric models…

Descriptors: Individual Differences, Logical Thinking, Timed Tests, College Students

Coefficient Alpha and Reliability of Scale Scores

Peer reviewed

Almehrizi, Rashid S. – Applied Psychological Measurement, 2013

The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…

Descriptors: Raw Scores, Scaling, Reliability, Computation

Classification Consistency and Accuracy for Complex Assessments under the Compound Multinomial Model

Peer reviewed

Lee, Won-Chan; Brennan, Robert L.; Wan, Lei – Applied Psychological Measurement, 2009

For a test that consists of dichotomously scored items, several approaches have been reported in the literature for estimating classification consistency and accuracy indices based on a single administration of a test. Classification consistency and accuracy have not been studied much, however, for "complex" assessments--for example,…

Descriptors: Classification, Reliability, Test Items, Scoring

Coping with Memory Effect and Serial Correlation when Estimating Reliability in a Longitudinal Framework

Peer reviewed

Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert; Vangeneugden, Tony; Mallinckrodt, Craig H. – Applied Psychological Measurement, 2010

Longitudinal studies are permeating clinical trials in psychiatry. Therefore, it is of utmost importance to study the psychometric properties of rating scales, frequently used in these trials, within a longitudinal framework. However, intrasubject serial correlation and memory effects are problematic issues often encountered in longitudinal data.…

Descriptors: Psychiatry, Rating Scales, Memory, Psychometrics

Commingled Samples: A Neglected Source of Bias in Reliability Analysis

Peer reviewed

Waller, Niels G. – Applied Psychological Measurement, 2008

Reliability is a property of test scores from individuals who have been sampled from a well-defined population. Reliability indices, such as coefficient alpha and related formulas for internal consistency reliability (KR-20, Hoyt's reliability), yield lower bound reliability estimates when (a) subjects have been sampled from a single population and when…

Descriptors: Test Items, Reliability, Scores, Psychometrics

Exploring Population Sensitivity of Linking Functions across Three Law School Admission Test Administrations

Peer reviewed

Liu, Mei; Holland, Paul W. – Applied Psychological Measurement, 2008

The simplified version of the Dorans and Holland (2000) measure of population invariance, the root mean square difference (RMSD), is used to explore the degree of dependence of linking functions on the Law School Admission Test (LSAT) subpopulations defined by examinees' gender, ethnic background, geographic region, law school application status,…

Descriptors: Law Schools, Equated Scores, Geographic Regions, Geometric Concepts

Comparison of the Null Distributions of Weighted Kappa and the C Ordinal Statistic

Peer reviewed

Cicchetti, Domenic V.; Fleiss, Joseph L. – Applied Psychological Measurement, 1977

The weighted kappa coefficient is a measure of interrater agreement when the relative seriousness of each possible disagreement can be quantified. This Monte Carlo study demonstrates the utility of the kappa coefficient for ordinal data. Sample size is also briefly discussed. (Author/JKS)

Descriptors: Mathematical Models, Rating Scales, Reliability, Sampling

Cognitive Diagnostic Attribute-Level Discrimination Indices

Peer reviewed

Henson, Robert; Roussos, Louis; Douglas, Jeff; He, Xuming – Applied Psychological Measurement, 2008

Cognitive diagnostic models (CDMs) model the probability of correctly answering an item as a function of an examinee's attribute mastery pattern. Because estimation of the mastery pattern involves more than a continuous measure of ability, reliability concepts introduced by classical test theory and item response theory do not apply. The cognitive…

Descriptors: Diagnostic Tests, Classification, Probability, Item Response Theory

Author

Feldt, Leonard S.	5
Alsawalmeh, Yousef M.	4
Raykov, Tenko	4
Brennan, Robert L.	3
Ferrando, Pere J.	3
Fleiss, Joseph L.	3
Raju, Nambury S.	3
Cicchetti, Domenic V.	2
Culpepper, Steven Andrew	2
Forsyth, Robert A.	2
Mellenbergh, Gideon J.	2
Nicewander, W. Alan	2
Oshima, T. C.	2
Williams, Richard H.	2
Zimmerman, Donald W.	2
Almehrizi, Rashid S.	1
Alonso, Ariel	1
Ankenmann, Robert D.	1
Attali, Yigal	1
Backteman, G.	1
Baldwin, Peter	1
Beauducel, Andre	1
Bonnett, Douglas G.	1
Bost, James E.	1