ERIC - Search Results

Descriptor: Comparative Analysis

Showing 61 to 75 of 1,930 results

Developing a Single-Item General Self-Efficacy Scale: An Initial Study

Peer reviewed

Di, Weiwei; Nie, Youyan; Chua, Bee Leng; Chye, Stefanie; Teo, Timothy – Journal of Psychoeducational Assessment, 2023

General self-efficacy represents the global sense of personal capability across various situations and tasks. The aim of the present study was to develop and validate a single-item general self-efficacy scale which balances practical demands and psychometric concerns. The psychometric properties of the proposed Single-Item General Self-Efficacy…

Descriptors: Self Efficacy, Self Concept Measures, Psychometrics, Adults

How Do Judges in Comparative Judgement Exercises Make Their Judgements?

Leech, Tony; Chambers, Lucy – Research Matters, 2022

Two of the central issues in comparative judgement (CJ), which are perhaps underexplored compared to questions of the method's reliability and technical quality, are "what processes do judges use to make their decisions" and "what features do they focus on when making their decisions?" This article discusses both, in the…

Descriptors: Comparative Analysis, Decision Making, Evaluators, Reliability

Treatments of Differential Item Functioning: A Comparison of Four Methods

Peer reviewed

Liu, Xiaowen; Rogers, H. Jane – Educational and Psychological Measurement, 2022

Test fairness is critical to the validity of group comparisons involving gender, ethnicities, culture, or treatment conditions. Detection of differential item functioning (DIF) is one component of efforts to ensure test fairness. The current study compared four treatments for items that have been identified as showing DIF: deleting, ignoring,…

Descriptors: Item Analysis, Comparative Analysis, Culture Fair Tests, Test Validity

Variable Height Step Test Provides Reliable Heart Rate Values during Virtual Cardiorespiratory Fitness Testing

Peer reviewed

Matthews, Evan L.; Horvat, Fiona M.; Phillips, David A. – Measurement in Physical Education and Exercise Science, 2022

The YMCA step test uses a prescribed step height which is difficult in a telehealth setting. Examine a modification of the YMCA step test allowing for the use of preexisting in-home objects of variable height as the "step" in a virtual environment. Young healthy participants (n = 40) performed step tests with a small and large object of…

Descriptors: Physical Fitness, Metabolism, Tests, Measurement

Kappa Coefficients for Missing Data

Peer reviewed

De Raadt, Alexandra; Warrens, Matthijs J.; Bosker, Roel J.; Kiers, Henk A. L. – Educational and Psychological Measurement, 2019

Cohen's kappa coefficient is commonly used for assessing agreement between classifications of two raters on a nominal scale. Three variants of Cohen's kappa that can handle missing data are presented. Data are considered missing if one or both ratings of a unit are missing. We study how well the variants estimate the kappa value for complete data…

Descriptors: Interrater Reliability, Data, Statistical Analysis, Statistical Bias

The Intersection of AI and Language Assessment: A Study on the Reliability of ChatGPT in Grading IELTS Writing Task 2

Peer reviewed

Osama Koraishi – Language Teaching Research Quarterly, 2024

This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading Task 2 writing of the IELTS exam. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis encompassed a multifaceted approach, including a comparison of means and reliability…

Descriptors: Second Language Learning, English (Second Language), Language Tests, Artificial Intelligence

Integration of Interactive Computer Simulations in Teaching and Learning Chemical Reaction: Students' Performance and Concept Retention

Peer reviewed

Jane Batamuliza; Gonzague Habinshuti; Jean Baptiste Nkurunziza – Journal of Technology and Science Education, 2024

This current study presents the effects of interactive computer simulations on students' performance and concept retention in the unit of chemical reactions. Purposive sampling was used to select four schools with a sample population of 320. The Achievement test on chemical reactions was developed, validated, and checked for reliability. The…

Descriptors: Chemistry, Science Instruction, Teaching Methods, Comparative Analysis

Investigating the Measurement of OTL in PISA 2012 and Its Relationship with Self-Efficacy and Mathematics Achievement: Doubly Latent Multilevel Analyses

Peer reviewed

Wang, Faming; Wang, Yehui; Liu, Yaping; Leung, Shing On – Scandinavian Journal of Educational Research, 2023

The importance of the opportunity to learn (OTL) for mathematics achievement has been extensively researched. However, there were still unanswered questions regarding OTL's measurement, analytical level, and relationship with motivational beliefs. To fill in the gaps, we aimed to (1) scrutinize the reliability and validity of OTL, (2) investigate…

Descriptors: International Assessment, Foreign Countries, Achievement Tests, Secondary School Students

Meta-Analysis of Inter-Rater Agreement and Discrepancy Between Human and Automated English Essay Scoring

Peer reviewed

Jiyeo Yun – English Teaching, 2023

Studies on automatic scoring systems in writing assessments have also evaluated the relationship between human and machine scores for the reliability of automated essay scoring systems. This study investigated the magnitudes of indices for inter-rater agreement and discrepancy, especially regarding human and machine scoring, in writing assessment.…

Descriptors: Meta Analysis, Interrater Reliability, Essays, Scoring

Crowdsourced Adaptive Comparative Judgment: A Community-Based Solution for Proficiency Rating

Peer reviewed

Paquot, Magali; Rubin, Rachel; Vandeweerd, Nathan – Language Learning, 2022

The main objective of this Methods Showcase Article is to show how the technique of adaptive comparative judgment, coupled with a crowdsourcing approach, can offer practical solutions to reliability issues as well as to address the time and cost difficulties associated with a text-based approach to proficiency assessment in L2 research. We…

Descriptors: Comparative Analysis, Decision Making, Language Proficiency, Reliability

Item Response Theory, Computer Adaptive Testing and the Risk of Self-Deception

Benton, Tom – Research Matters, 2021

Computer adaptive testing is intended to make assessment more reliable by tailoring the difficulty of the questions a student has to answer to their level of ability. Most commonly, this benefit is used to justify the length of tests being shortened whilst retaining the reliability of a longer, non-adaptive test. Improvements due to adaptive…

Descriptors: Risk, Item Response Theory, Computer Assisted Testing, Difficulty Level

Depth-Perception-Based Representation in Holistic Rating on ESL Essay Writing

Peer reviewed

Lian Li; Jiehui Hu; Yu Dai; Ping Zhou; Wanhong Zhang – Reading & Writing Quarterly, 2024

This paper proposes to use depth perception to represent raters' decision in holistic evaluation of ESL essays, as an alternative medium to conventional form of numerical scores. The researchers verified the new method's accuracy and inter/intra-rater reliability by inviting 24 ESL teachers to perform different representations when rating 60…

Descriptors: Essays, Holistic Approach, Writing Evaluation, Accuracy

Validity and Reliability of Student Perceptions of Teaching Quality in Primary Education

Peer reviewed

van der Scheer, Emmelien A.; Bijlsma, Hannah J. E.; Glas, Cees A. W. – School Effectiveness and School Improvement, 2019

A Bayesian IRT-model approach was used to investigate the validity and reliability of student perceptions of teaching quality. Furthermore, the student perceptions were compared with ratings of teaching quality by external observers. Grade 4 students (n = 675) filled out a questionnaire that was used to measure their opinions about the lessons of…

Descriptors: Student Attitudes, Validity, Interrater Reliability, Correlation

The Effect of Gersmehl's Spatial Learning on Students' Disaster Spatial Literacy

Peer reviewed

Purwanto; Hidayah, Niswatul; Wagistina, Satti – International Journal of Educational Methodology, 2023

Learning geography in Indonesia philosophically aims to develop spatial literacy. Students must improve spatial literacy to form reasoning skills and apply spatial concepts in real life. Applying Gersmehl's spatial learning can improve students' spatial literacy through syntax arranged based on spatial aspects. The use of Google Earth helps…

Descriptors: Spatial Ability, Natural Disasters, Geography Instruction, Teaching Methods

Reliability and Validity of Methods to Assess Undergraduate Healthcare Student Performance in Pharmacology: Comparison of Open Book versus Time-Limited Closed Book Examinations

Peer reviewed

David Bell; Vikki O'Neill; Vivienne Crawford – Practitioner Research in Higher Education, 2023

We compared the influence of open-book extended duration versus closed book time-limited format on reliability and validity of written assessments of pharmacology learning outcomes within our medical and dental courses. Our dental cohort undertake a mid-year test (30 × free-response short answer to a question, SAQ) and end-of-year paper (4 × SAQ,…

Descriptors: Undergraduate Students, Pharmacology, Pharmaceutical Education, Test Format
