Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 0 |
Since 2006 (last 20 years) | 10 |
Descriptor
Test Reliability | 23 |
Reliability | 16 |
Higher Education | 9 |
Test Items | 8 |
Models | 7 |
Test Validity | 7 |
Correlation | 6 |
Foreign Countries | 6 |
Test Construction | 6 |
Classification | 5 |
Comparative Analysis | 5 |
Source
Applied Psychological Measurement | 38
Publication Type
Journal Articles | 38 |
Reports - Research | 38 |
Information Analyses | 1 |
Opinion Papers | 1 |
Reports - Evaluative | 1 |
Tests/Questionnaires | 1 |
Education Level
Higher Education | 4 |
Postsecondary Education | 2 |
Early Childhood Education | 1 |
Elementary Education | 1 |
Grade 2 | 1 |
High Schools | 1 |
Primary Education | 1 |
Secondary Education | 1 |
Harik, Polina; Baldwin, Peter; Clauser, Brian – Applied Psychological Measurement, 2013
Growing reliance on complex constructed response items has generated considerable interest in automated scoring solutions. Many of these solutions are described in the literature; however, relatively few studies have been published that "compare" automated scoring strategies. Here, comparisons are made among five strategies for…
Descriptors: Computer Assisted Testing, Automation, Scoring, Comparative Analysis
Wyse, Adam E.; Hao, Shiqi – Applied Psychological Measurement, 2012
This article introduces two new classification consistency indices that can be used when item response theory (IRT) models have been applied. The new indices are shown to be related to Rudner's classification accuracy index and Guo's classification accuracy index. The Rudner- and Guo-based classification accuracy and consistency indices are…
Descriptors: Item Response Theory, Classification, Accuracy, Reliability
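A minimal sketch, assuming normally distributed estimation error, of the Rudner-style classification accuracy idea this entry builds on; the ability estimates, standard errors, and cut points below are simulated for illustration, not taken from the article.

import numpy as np
from scipy.stats import norm

def rudner_accuracy(theta_hat, se, cuts):
    """Expected classification accuracy under a normal approximation."""
    bounds = np.concatenate(([-np.inf], np.sort(cuts), [np.inf]))
    acc = []
    for t, s in zip(theta_hat, se):
        k = np.searchsorted(bounds, t) - 1          # category assigned by theta_hat
        lo, hi = bounds[k], bounds[k + 1]
        # probability that the true ability falls in the same category
        acc.append(norm.cdf(hi, loc=t, scale=s) - norm.cdf(lo, loc=t, scale=s))
    return float(np.mean(acc))

rng = np.random.default_rng(0)
theta_hat = rng.normal(size=500)                    # estimated abilities (simulated)
se = np.full(500, 0.3)                              # conditional standard errors (assumed)
print(rudner_accuracy(theta_hat, se, cuts=[-0.5, 0.5]))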
Weijters, Bert; Geuens, Maggie; Schillewaert, Niels – Applied Psychological Measurement, 2010
The severity of bias in respondents' self-reports due to acquiescence response style (ARS) and extreme response style (ERS) depends strongly on how consistent these response styles are over the course of a questionnaire. In the literature, different alternative hypotheses on response style (in)consistency circulate. Therefore, nine alternative…
Descriptors: Models, Response Style (Tests), Questionnaires, Measurement Techniques
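For orientation only: crude acquiescence (ARS) and extreme (ERS) response-style counts from 5-point Likert data. The article compares model-based representations of response-style (in)consistency, which this sketch does not implement; the response matrix is simulated.

import numpy as np

def response_style_indices(responses):
    """responses: (n_respondents, n_items) array of values 1..5."""
    ars = np.mean(responses >= 4, axis=1)               # share of agree / strongly agree
    ers = np.mean(np.isin(responses, [1, 5]), axis=1)   # share of endpoint responses
    return ars, ers

rng = np.random.default_rng(1)
resp = rng.integers(1, 6, size=(200, 20))   # 200 respondents, 20 items (simulated)
ars, ers = response_style_indices(resp)
print(ars.mean(), ers.mean())

# Consistency over the questionnaire could be probed by computing the same
# indices per item block and correlating them across blocks.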
Semmes, Robert; Davison, Mark L.; Close, Catherine – Applied Psychological Measurement, 2011
If numerical reasoning items are administered under time limits, will two dimensions be required to account for the responses, a numerical ability dimension and a speed dimension? A total of 182 college students answered 74 numerical reasoning items. Every item was taken with and without time limits by half the students. Three psychometric models…
Descriptors: Individual Differences, Logical Thinking, Timed Tests, College Students
Raymond, Mark R.; Harik, Polina; Clauser, Brian E. – Applied Psychological Measurement, 2011
Prior research indicates that the overall reliability of performance ratings can be improved by using ordinary least squares (OLS) regression to adjust for rater effects. The present investigation extends previous work by evaluating the impact of OLS adjustment on standard errors of measurement ("SEM") at specific score levels. In…
Descriptors: Performance Based Assessment, Licensing Examinations (Professions), Least Squares Statistics, Item Response Theory
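A rough sketch of OLS adjustment for rater leniency/severity under an assumed sparse rating design; the dummy coding and simulated data are illustrative, not the study's procedure.

import numpy as np

rng = np.random.default_rng(2)
n_examinees, n_raters = 100, 8
true_score = rng.normal(size=n_examinees)
rater_effect = rng.normal(scale=0.5, size=n_raters)

# Each examinee is rated by two randomly chosen raters (assumed design).
rows = []
for e in range(n_examinees):
    for r in rng.choice(n_raters, size=2, replace=False):
        rows.append((e, r, true_score[e] + rater_effect[r] + rng.normal(scale=0.3)))
examinee, rater, rating = map(np.array, zip(*rows))

# Design matrix: one dummy per examinee plus one dummy per rater,
# with the last rater as the reference category.
n_obs = len(rating)
X = np.zeros((n_obs, n_examinees + n_raters - 1))
X[np.arange(n_obs), examinee] = 1.0
keep = rater < n_raters - 1
X[np.where(keep)[0], n_examinees + rater[keep]] = 1.0

beta, *_ = np.linalg.lstsq(X, rating, rcond=None)
rater_hat = np.append(beta[n_examinees:], 0.0)   # reference rater's effect fixed at 0
adjusted = rating - rater_hat[rater]             # ratings with estimated rater effects removed
print(np.round(rater_hat, 2))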
Almehrizi, Rashid S. – Applied Psychological Measurement, 2013
The majority of large-scale assessments develop various score scales that are either linear or nonlinear transformations of raw scores for better interpretations and uses of assessment results. The current formula for coefficient alpha (α; the commonly used reliability coefficient) only provides internal consistency reliability estimates of raw…
Descriptors: Raw Scores, Scaling, Reliability, Computation
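For reference, the standard raw-score coefficient alpha this entry starts from, as a quick sketch; the article's extension to nonlinearly transformed scale scores is not reproduced here, and the dichotomous item data are simulated.

import numpy as np

def coefficient_alpha(item_scores):
    """item_scores: (n_examinees, n_items) matrix of item scores."""
    k = item_scores.shape[1]
    item_vars = item_scores.var(axis=0, ddof=1).sum()
    total_var = item_scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_vars / total_var)

rng = np.random.default_rng(3)
ability = rng.normal(size=(300, 1))
items = (ability + rng.normal(size=(300, 12)) > 0).astype(float)   # simulated 0/1 items
print(coefficient_alpha(items))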
Bechger, Timo M.; Maris, Gunter; Hsiao, Ya Ping – Applied Psychological Measurement, 2010
The main purpose of this article is to demonstrate how halo effects may be detected and quantified using two independent ratings of the same person. A practical illustration is given to show how halo effects can be avoided. (Contains 2 tables, 7 figures, and 2 notes.)
Descriptors: Performance Based Assessment, Test Reliability, Test Length, Language Tests
Laenen, Annouschka; Alonso, Ariel; Molenberghs, Geert; Vangeneugden, Tony; Mallinckrodt, Craig H. – Applied Psychological Measurement, 2010
Longitudinal studies are permeating clinical trials in psychiatry. Therefore, it is of utmost importance to study the psychometric properties of rating scales, frequently used in these trials, within a longitudinal framework. However, intrasubject serial correlation and memory effects are problematic issues often encountered in longitudinal data.…
Descriptors: Psychiatry, Rating Scales, Memory, Psychometrics
Liu, Mei; Holland, Paul W. – Applied Psychological Measurement, 2008
The simplified version of the Dorans and Holland (2000) measure of population invariance, the root mean square difference (RMSD), is used to explore the degree of dependence of linking functions on the Law School Admission Test (LSAT) subpopulations defined by examinees' gender, ethnic background, geographic region, law school application status,…
Descriptors: Law Schools, Equated Scores, Geographic Regions, Geometric Concepts
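A hedged sketch of the root mean square difference idea: compare each subgroup's linking function with the total-group function at every score point, weight by subgroup size, and express the result in reference-form standard deviation units. The linking functions, weights, and SD below are invented inputs, not LSAT values.

import numpy as np

def rmsd(total_link, subgroup_links, weights, sd_reference):
    """total_link: (n_scores,); subgroup_links: (n_groups, n_scores)."""
    diffs = subgroup_links - total_link                  # difference at each score point
    msd = np.average(diffs ** 2, axis=0, weights=weights)
    return np.sqrt(msd) / sd_reference                   # expressed in SD units

scores = np.arange(0, 101)
total = 0.98 * scores + 1.0                              # total-group linear link (assumed)
groups = np.vstack([0.97 * scores + 1.5, 0.99 * scores + 0.6])
print(rmsd(total, groups, weights=[0.4, 0.6], sd_reference=15.0).max())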
Henson, Robert; Roussos, Louis; Douglas, Jeff; He, Xuming – Applied Psychological Measurement, 2008
Cognitive diagnostic models (CDMs) model the probability of correctly answering an item as a function of an examinee's attribute mastery pattern. Because estimation of the mastery pattern involves more than a continuous measure of ability, reliability concepts introduced by classical test theory and item response theory do not apply. The cognitive…
Descriptors: Diagnostic Tests, Classification, Probability, Item Response Theory
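For concreteness, a sketch of one widely used CDM, the DINA model (chosen here only as an example; the article treats CDMs and their reliability more generally): the correct-response probability depends on whether the examinee has mastered every attribute the item requires.

import numpy as np

def dina_prob(alpha, q, slip, guess):
    """alpha: (n_attributes,) 0/1 mastery pattern; q: (n_items, n_attributes) Q-matrix."""
    eta = np.all(alpha >= q, axis=1)          # masters every required attribute?
    return np.where(eta, 1.0 - slip, guess)

q = np.array([[1, 0], [0, 1], [1, 1]])        # 3 items, 2 attributes (assumed Q-matrix)
print(dina_prob(np.array([1, 0]), q, slip=0.1, guess=0.2))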

van der Linden, Wim J.; Boekkooi-Timminga, Ellen – Applied Psychological Measurement, 1988
Gulliksen's matched random subtests method is a graphical method to split a test into parallel test halves, allowing maximization of coefficient alpha as a lower bound to the classical test reliability coefficient. This problem is formulated as a zero-one programming problem solvable by algorithms that already exist. (TJH)
Descriptors: Algorithms, Equations (Mathematics), Programing, Test Reliability
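A toy version of the matched-halves idea, using exhaustive search over 0/1 assignments in place of the zero-one programming algorithms the article refers to; the item difficulties and discriminations are simulated.

import itertools
import numpy as np

rng = np.random.default_rng(4)
n_items = 10
difficulty = rng.uniform(0.3, 0.9, size=n_items)       # item p-values (simulated)
discrimination = rng.uniform(0.2, 0.6, size=n_items)   # item-total correlations (simulated)

best, best_gap = None, np.inf
for half in itertools.combinations(range(n_items), n_items // 2):
    a = np.array(half)
    b = np.setdiff1d(np.arange(n_items), a)
    # mismatch between the two halves on summed item statistics
    gap = (abs(difficulty[a].sum() - difficulty[b].sum())
           + abs(discrimination[a].sum() - discrimination[b].sum()))
    if gap < best_gap:
        best, best_gap = (a, b), gap

print("half A:", best[0], "half B:", best[1], "mismatch:", round(best_gap, 4))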

Magnusson, D.; Backteman, G. – Applied Psychological Measurement, 1979
A longitudinal study of approximately 1,000 students aged 10-16 showed high stability of intelligence and creativity. Stability coefficients for intelligence were higher than those for creativity. Results supported the construct validity of creativity. (MH)
Descriptors: Creativity, Creativity Tests, Elementary Secondary Education, Foreign Countries

Divgi, D. R. – Applied Psychological Measurement, 1980
The dependence of reliability indices for mastery tests on mean and cutoff scores was examined in the case of three decision-theoretic indices. Dependence of kappa on mean and cutoff scores was opposite to that of the proportion of correct decisions, which was linearly related to average threshold loss. (Author/BW)
Descriptors: Classification, Cutting Scores, Mastery Tests, Test Reliability
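A small illustrative sketch (simulated scores, not the article's data) of two of the quantities involved: the proportion of consistent mastery decisions across parallel forms and the corresponding kappa, both of which shift as the cutoff moves relative to the score mean.

import numpy as np

def decision_consistency(form1, form2, cutoff):
    pass1, pass2 = form1 >= cutoff, form2 >= cutoff
    p0 = np.mean(pass1 == pass2)                 # proportion of consistent decisions
    # chance agreement from the marginal pass rates
    pc = pass1.mean() * pass2.mean() + (1 - pass1.mean()) * (1 - pass2.mean())
    kappa = (p0 - pc) / (1 - pc)
    return p0, kappa

rng = np.random.default_rng(5)
true = rng.normal(70, 10, size=1000)
form1 = true + rng.normal(0, 5, size=1000)
form2 = true + rng.normal(0, 5, size=1000)
print(decision_consistency(form1, form2, cutoff=75.0))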

Whitely, Susan E. – Applied Psychological Measurement, 1979
Two sources of inconsistency were separated by reanalyzing data from a major study on short-term consistency. Little evidence was found for generalizability or behavioral predictability. Results supported the assumption that measurement error from short-term fluctuations is not due to systematic individual differences in response consistency.…
Descriptors: Behavior Change, Cognitive Processes, College Freshmen, Error of Measurement

Brennan, Robert L.; Lockwood, Robert E. – Applied Psychological Measurement, 1980
Generalizability theory is used to characterize and quantify expected variance in cutting scores and to compare the Nedelsky and Angoff procedures for establishing a cutting score. Results suggest that the restricted nature of the Nedelsky (inferred) probability scale may limit its applicability in certain contexts. (Author/BW)
Descriptors: Cutting Scores, Generalization, Statistical Analysis, Test Reliability
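An illustrative sketch of the Angoff side of the comparison: the cut score as the mean over judges of summed item probability judgments, with the spread across judges giving a simple estimate of cut-score variability. The judgments are simulated; the Nedelsky procedure and the full generalizability analysis are not reproduced.

import numpy as np

rng = np.random.default_rng(6)
n_judges, n_items = 12, 40
# each judge's probability that a minimally competent examinee answers each item
judgments = np.clip(rng.normal(0.65, 0.12, size=(n_judges, n_items)), 0.0, 1.0)

judge_cuts = judgments.sum(axis=1)                     # each judge's implied cut score
cut_score = judge_cuts.mean()
se_cut = judge_cuts.std(ddof=1) / np.sqrt(n_judges)    # variability across judges

print(f"cut = {cut_score:.1f} items, SE over judges = {se_cut:.2f}")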