ERIC - Search Results

Publication Date

In 2025	0
Since 2024	0
Since 2021 (last 5 years)	0
Since 2016 (last 10 years)	0
Since 2006 (last 20 years)	29

Descriptor

Reliability	74
Test Reliability	72
Error of Measurement	28
Higher Education	24
Test Items	22
Correlation	21
Estimation (Mathematics)	20
Item Response Theory	20
Test Validity	20
Statistical Analysis	19
Equations (Mathematics)	18
Test Construction	18
Test Theory	18
Mathematical Models	17
Models	17
Scores	16
Rating Scales	15
Simulation	15
Computation	12
Measurement Techniques	12
Psychometrics	12
Scoring	12
Evaluation Methods	11
Sampling	11
Comparative Analysis	10
More ▼

Source

Applied Psychological…

149

Publication Type

Journal Articles	117
Reports - Evaluative	51
Reports - Research	38
Reports - Descriptive	17
Opinion Papers	4
Book/Product Reviews	3
Information Analyses	3
Collected Works - Serials	2
Reports - General	2
Collected Works - General	1
Guides - Non-Classroom	1
Tests/Questionnaires	1
More ▼

Education Level

Higher Education	5
Postsecondary Education	3
Early Childhood Education	1
Elementary Education	1
Grade 2	1
High Schools	1
Primary Education	1
Secondary Education	1

Audience

Location

West Germany	2
Australia	1
Belgium	1
Germany	1
Michigan	1
Netherlands	1
Spain	1
Sweden	1

Laws, Policies, & Programs

What Works Clearinghouse Rating

Applied Psychological Measurement X

Showing 1 to 15 of 149 results Save | Export

The Reliability and Precision of Total Scores and IRT Estimates as a Function of Polytomous IRT Parameters and Latent Trait Distribution

Peer reviewed

Direct link

Culpepper, Steven Andrew – Applied Psychological Measurement, 2013

A classic topic in the fields of psychometrics and measurement has been the impact of the number of scale categories on test score reliability. This study builds on previous research by further articulating the relationship between item response theory (IRT) and classical test theory (CTT). Equations are presented for comparing the reliability and…

Descriptors: Item Response Theory, Reliability, Scores, Error of Measurement

A Latent Class Approach to Estimating Test-Score Reliability

Peer reviewed

Direct link

van der Ark, L. Andries; van der Palm, Daniel W.; Sijtsma, Klaas – Applied Psychological Measurement, 2011

This study presents a general framework for single-administration reliability methods, such as Cronbach's alpha, Guttman's lambda-2, and method MS. This general framework was used to derive a new approach to estimating test-score reliability by means of the unrestricted latent class model. This new approach is the latent class reliability…

Descriptors: Simulation, Reliability, Measurement, Psychology

Comparing the Performance of Five Multidimensional CAT Selection Procedures with Different Stopping Rules

Peer reviewed

Direct link

Yao, Lihua – Applied Psychological Measurement, 2013

Through simulated data, five multidimensional computerized adaptive testing (MCAT) selection procedures with varying test lengths are examined and compared using different stopping rules. Fixed item exposure rates are used for all the items, and the Priority Index (PI) method is used for the content constraints. Two stopping rules, standard error…

Descriptors: Computer Assisted Testing, Adaptive Testing, Test Items, Selection

Comparison of Automated Scoring Methods for a Computerized Performance Assessment of Clinical Judgment

Peer reviewed

Direct link

Harik, Polina; Baldwin, Peter; Clauser, Brian – Applied Psychological Measurement, 2013

Growing reliance on complex constructed response items has generated considerable interest in automated scoring solutions. Many of these solutions are described in the literature; however, relatively few studies have been published that "compare" automated scoring strategies. Here, comparisons are made among five strategies for…

Descriptors: Computer Assisted Testing, Automation, Scoring, Comparative Analysis

Taking the Error Term of the Factor Model into Account: The Factor Score Predictor Interval

Peer reviewed

Direct link

Beauducel, Andre – Applied Psychological Measurement, 2013

The problem of factor score indeterminacy implies that the factor and the error scores cannot be completely disentangled in the factor model. It is therefore proposed to compute Harman's factor score predictor that contains an additive combination of factor and error variance. This additive combination is discussed in the framework of classical…

Descriptors: Factor Analysis, Predictor Variables, Reliability, Error of Measurement

Dynamic Problem Solving: A New Assessment Perspective

Peer reviewed

Direct link

Greiff, Samuel; Wustenberg, Sascha; Funke, Joachim – Applied Psychological Measurement, 2012

This article addresses two unsolved measurement issues in dynamic problem solving (DPS) research: (a) unsystematic construction of DPS tests making a comparison of results obtained in different studies difficult and (b) use of time-intensive single tasks leading to severe reliability problems. To solve these issues, the MicroDYN approach is…

Descriptors: Problem Solving, Tests, Measurement, Structural Equation Models

Evaluating EIV, OLS, and SEM Estimators of Group Slope Differences in the Presence of Measurement Error: The Single-Indicator Case

Peer reviewed

Direct link

Culpepper, Steven Andrew – Applied Psychological Measurement, 2012

Measurement error significantly biases interaction effects and distorts researchers' inferences regarding interactive hypotheses. This article focuses on the single-indicator case and shows how to accurately estimate group slope differences by disattenuating interaction effects with errors-in-variables (EIV) regression. New analytic findings were…

Descriptors: Evidence, Test Length, Interaction, Regression (Statistics)

An Evaluation of Item Response Theory Classification Accuracy and Consistency Indices

Peer reviewed

Direct link

Wyse, Adam E.; Hao, Shiqi – Applied Psychological Measurement, 2012

This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…

Descriptors: Item Response Theory, Classification, Accuracy, Reliability

The Individual Consistency of Acquiescence and Extreme Response Style in Self-Report Questionnaires

Peer reviewed

Direct link

Weijters, Bert; Geuens, Maggie; Schillewaert, Niels – Applied Psychological Measurement, 2010

The severity of bias in respondents' self-reports due to acquiescence response style (ARS) and extreme response style (ERS) depends strongly on how consistent these response styles are over the course of a questionnaire. In the literature, different alternative hypotheses on response style (in)consistency circulate. Therefore, nine alternative…

Descriptors: Models, Response Style (Tests), Questionnaires, Measurement Techniques

Modeling Individual Differences in Numerical Reasoning Speed as a Random Effect of Response Time Limits

Peer reviewed

Direct link

Semmes, Robert; Davison, Mark L.; Close, Catherine – Applied Psychological Measurement, 2011

If numerical reasoning items are administered under time limits, will two dimensions be required to account for the responses, a numerical ability dimension and a speed dimension? A total of 182 college students answered 74 numerical reasoning items. Every item was taken with and without time limits by half the students. Three psychometric models…

Descriptors: Individual Differences, Logical Thinking, Timed Tests, College Students

The Impact of Statistically Adjusting for Rater Effects on Conditional Standard Errors of Performance Ratings

Peer reviewed

Direct link

Raymond, Mark R.; Harik, Polina; Clauser, Brian E. – Applied Psychological Measurement, 2011

Prior research indicates that the overall reliability of performance ratings can be improved by using ordinary least squares (OLS) regression to adjust for rater effects. The present investigation extends previous work by evaluating the impact of OLS adjustment on standard errors of measurement ("SEM") at specific score levels. In…

Descriptors: Performance Based Assessment, Licensing Examinations (Professions), Least Squares Statistics, Item Response Theory

A Comment on "The "J" Index as a Measure of Nominal Scale Response Agreement"

Peer reviewed

Direct link

Warrens, Matthijs J. – Applied Psychological Measurement, 2009

Various authors have proposed agreement indices for measuring nominal scale response agreement between two judges. Two situations may occur. Either the categories of the nominal scale are defined in advance and both raters use the same categories, or the categories are not defined in advance and the number of categories used by each rater is…

Descriptors: Measures (Individuals), Responses, Interrater Reliability, Equations (Mathematics)

Coefficient Alpha and Reliability of Scale Scores

Peer reviewed

Direct link

Almehrizi, Rashid S. – Applied Psychological Measurement, 2013

The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (a; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…

Descriptors: Raw Scores, Scaling, Reliability, Computation

Detecting Halo Effects in Performance-Based Examinations

Peer reviewed

Direct link

Bechger, Timo M.; Maris, Gunter; Hsiao, Ya Ping – Applied Psychological Measurement, 2010

The main purpose of this article is to demonstrate how halo effects may be detected and quantified using two independent ratings of the same person. A practical illustration is given to show how halo effects can be avoided. (Contains 2 tables, 7 figures, and 2 notes.)

Descriptors: Performance Based Assessment, Test Reliability, Test Length, Language Tests

Classification Consistency and Accuracy for Complex Assessments under the Compound Multinomial Model

Peer reviewed

Direct link

Lee, Won-Chan; Brennan, Robert L.; Wan, Lei – Applied Psychological Measurement, 2009

For a test that consists of dichotomously scored items, several approaches have been reported in the literature for estimating classification consistency and accuracy indices based on a single administration of a test. Classification consistency and accuracy have not been studied much, however, for "complex" assessments--for example,…

Descriptors: Classification, Reliability, Test Items, Scoring

Previous Page | Next Page »

Pages: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10

Feldt, Leonard S.	5
Alsawalmeh, Yousef M.	4
Brennan, Robert L.	4
Raykov, Tenko	4
Ferrando, Pere J.	3
Fleiss, Joseph L.	3
Humphreys, Lloyd G.	3
Mellenbergh, Gideon J.	3
Raju, Nambury S.	3
Zimmerman, Donald W.	3
Cicchetti, Domenic V.	2
Culpepper, Steven Andrew	2
Davison, Mark L.	2
Divgi, D. R.	2
Forsyth, Robert A.	2
Harik, Polina	2
Lee, Won-Chan	2
Levin, Joel R.	2
Lindell, Michael K.	2
Lucke, Joseph F.	2
Luecht, Richard M.	2
Meijer, Rob R.	2
Moreland, John R.	2
Nicewander, W. Alan	2
More ▼

Graduate Record Examinations	3
California Psychological…	2
SAT (College Admission Test)	2
ACT Assessment	1
Armed Forces Qualification…	1
Armed Services Vocational…	1
Bem Sex Role Inventory	1
Defining Issues Test	1
Differential Aptitude Test	1
Edwards Personal Preference…	1
Eysenck Personality Inventory	1
Hidden Figures Test	1
Minnesota Importance…	1
Minnesota Multiphasic…	1
Rod and Frame Test	1
Sixteen Personality Factor…	1
Stanford Binet Intelligence…	1
Strong Campbell Interest…	1
United States Medical…	1
Washington University…	1
Wechsler Intelligence Scale…	1
Wechsler Preschool and…	1
More ▼