ERIC - Search Results

Showing 1 to 15 of 187 results

New Tests of Rater Drift in Trend Scoring

Peer reviewed

John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024

Trend scoring constructed response items (i.e. rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…

Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics

Assessing the Consistency Assumptions Underlying Network Meta-Regression Using Aggregate Data

Peer reviewed

Donegan, Sarah; Dias, Sofia; Welton, Nicky J. – Research Synthesis Methods, 2019

When numerous treatments exist for a disease (Treatments 1, 2, 3, etc), network meta-regression (NMR) examines whether each relative treatment effect (eg, mean difference for 2 vs 1, 3 vs 1, and 3 vs 2) differs according to a covariate (eg, disease severity). Two consistency assumptions underlie NMR: consistency of the treatment effects at the…

Descriptors: Reliability, Regression (Statistics), Outcomes of Treatment, Statistical Analysis

Multiple-Component Measurement Instruments in Heterogeneous Populations: Is There a Single Coefficient Alpha?

Peer reviewed

Raykov, Tenko; Marcoulides, George A.; Harrison, Michael; Menold, Natalja – Educational and Psychological Measurement, 2019

This note confronts the common use of a single coefficient alpha as an index informing about reliability of a multicomponent measurement instrument in a heterogeneous population. Two or more alpha coefficients could instead be meaningfully associated with a given instrument in finite mixture settings, and this may be increasingly more likely the…

Descriptors: Statistical Analysis, Test Reliability, Measures (Individuals), Computation

An Unbiased Estimate of Global Interrater Agreement

Peer reviewed

Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2017

Assessing global interrater agreement is difficult as most published indices are affected by the presence of mixtures of agreements and disagreements. A previously proposed method was shown to be specifically sensitive to global agreement, excluding mixtures, but also negatively biased. Here, we propose two alternatives in an attempt to find what…

Descriptors: Interrater Reliability, Evaluation Methods, Statistical Bias, Accuracy

Loosening Psychometric Constraints on Educational Assessments

Peer reviewed

Kane, Michael T. – Assessment in Education: Principles, Policy & Practice, 2017

In response to an argument by Baird, Andrich, Hopfenbeck and Stobart (2017), Michael Kane states that there needs to be a better fit between educational assessment and learning theory. In line with this goal, Kane will examine how psychometric constraints might be loosened by relaxing some psychometric "rules" in some assessment…

Descriptors: Educational Assessment, Psychometrics, Standards, Test Reliability

A Ratio Test of Interrater Agreement with High Specificity

Peer reviewed

Cousineau, Denis; Laurencelle, Louis – Educational and Psychological Measurement, 2015

Existing tests of interrater agreements have high statistical power; however, they lack specificity. If the ratings of the two raters do not show agreement but are not random, the current tests, some of which are based on Cohen's kappa, will often reject the null hypothesis, leading to the wrong conclusion that agreement is present. A new test of…

Descriptors: Interrater Reliability, Monte Carlo Methods, Measurement Techniques, Accuracy

The Reliability of Setting Grade Boundaries Using Comparative Judgement

Peer reviewed

Benton, Tom; Elliott, Gill – Research Papers in Education, 2016

In recent years the use of expert judgement to set and maintain examination standards has been increasingly criticised in favour of approaches based on statistical modelling. This paper reviews existing research on this controversy and attempts to unify the evidence within a framework where expertise is utilised in the form of comparative…

Descriptors: Reliability, Expertise, Mathematical Models, Standard Setting (Scoring)

The "Test of Financial Literacy": Development and Measurement Characteristics

Peer reviewed

Walstad, William B.; Rebeck, Ken – Journal of Economic Education, 2017

The "Test of Financial Literacy" (TFL) was created to measure the financial knowledge of high school students. Its content is based on the standards and benchmarks stated in the "National Standards for Financial Literacy" (Council for Economic Education 2013). The test development process involved extensive item writing and…

Descriptors: Tests, Money Management, Literacy, High School Students

Interrater Agreement Evaluation: A Latent Variable Modeling Approach

Peer reviewed

Raykov, Tenko; Dimitrov, Dimiter M.; von Eye, Alexander; Marcoulides, George A. – Educational and Psychological Measurement, 2013

A latent variable modeling method for evaluation of interrater agreement is outlined. The procedure is useful for point and interval estimation of the degree of agreement among a given set of judges evaluating a group of targets. In addition, the approach allows one to test for identity in underlying thresholds across raters as well as to identify…

Descriptors: Interrater Reliability, Models, Statistical Analysis, Computation

Development, Use and Implications of Diagnostic Creativity Assessment App, RDCA--Reisman Diagnostic Creativity Assessment

Peer reviewed

Reisman, Fredricka; Keiser, Larry; Otti, Obinna – Creativity Research Journal, 2016

The Reisman Diagnostic Creativity Assessment (RDCA) is a free online self-report creativity assessment that provides immediate feedback to the user and is diagnostic, rather than predictive, with the focus on making the user aware of creative strengths and weaknesses. Several engineering and teacher education studies have included the RDCA over a…

Descriptors: Creativity, Creativity Tests, Creative Development, Computer Oriented Programs

Cell Phone and Face-to-Face Interview Responses in Population-Based Surveys: How Do They Compare?

Peer reviewed

Mahfoud, Ziyad; Ghandour, Lilian; Ghandour, Blanche; Mokdad, Ali H.; Sibai, Abla M. – Field Methods, 2015

Findings on the reliability and cost-effectiveness of the use of cellular phones vis-à-vis face-to-face interviews in investigating health behaviors and conditions are presented for a national epidemiological sample from Lebanon. Using self-reported responses on identical questions, percentage agreement, κ statistics, and McNemar's test were used…

Descriptors: Telephone Surveys, Interviews, Surveys, Responses

Formative versus Reflective Measurement of Executive Function Tasks: Response to Commentaries and Another Perspective

Peer reviewed

Willoughby, Michael T. – Grantee Submission, 2014

The focus article (Willoughby et al., 2014) (1) introduced the distinction between formative and reflective measurement and (2) proposed that performance-based executive function tasks may be better conceptualized from the perspective of formative rather than reflective measurement. This proposal stands in sharp contrast to conventional…

Descriptors: Executive Function, Formative Evaluation, Cognitive Measurement, Factor Analysis

Reformers, Batting Averages, and Malpractice: The Case for Caution in Value-Added Use

Peer reviewed

Gleason, Daniel – Educational Forum, 2014

The essay considers two analogies that help to reveal the limitations of value-added modeling: the first, a comparison with batting averages, shows that the model's reliability is quite limited even though year-to-year correlation figures may seem impressive; the second, a comparison between medical malpractice and so-called educational…

Descriptors: Models, Evaluation Methods, Reliability, Correlation

Difference Scores from the Point of View of Reliability and Repeated-Measures ANOVA: In Defense of Difference Scores for Data Analysis

Peer reviewed

Thomas, D. Roland; Zumbo, Bruno D. – Educational and Psychological Measurement, 2012

There is such doubt in research practice about the reliability of difference scores that granting agencies, journal editors, reviewers, and committees of graduate students' theses have been known to deplore their use. This most maligned index can be used in studies of change, growth, or perhaps discrepancy between two measures taken on the same…

Descriptors: Statistical Analysis, Reliability, Scores, Change

Exploring Students' Conceptions of Science Learning via Drawing: A Cross-Sectional Analysis

Peer reviewed

Hsieh, Wen-Min; Tsai, Chin-Chung – International Journal of Science Education, 2017

This cross-sectional study explored students' conceptions of science learning via drawing analysis. A total of 906 Taiwanese students in 4th, 6th, 8th, 10th, and 12th grade were asked to use drawing to illustrate how they conceptualise science learning. Students' drawings were analysed using a coding checklist to determine the presence or absence…

Descriptors: Science Instruction, Teaching Methods, Case Studies, Positive Attitudes
