ERIC - Search Results

Publication Date

In 2025	0
Since 2024	4
Since 2021 (last 5 years)	12
Since 2016 (last 10 years)	28
Since 2006 (last 20 years)	69

Descriptor

Scoring	116
Student Evaluation	116
Interrater Reliability	51
Test Reliability	43
Reliability	37
Test Validity	33
Evaluation Methods	31
Writing Evaluation	31
Test Construction	22
Foreign Countries	21
Educational Assessment	20
Validity	19
Performance Based Assessment	17
Comparative Analysis	15
Scores	15
Measurement Techniques	14
Elementary Secondary Education	13
Scoring Rubrics	13
Essays	12
Evaluators	12
Testing	12
Computer Assisted Testing	11
Elementary School Students	11
College Students	10
English (Second Language)	10

Publication Type

Journal Articles	65
Reports - Research	58
Reports - Evaluative	25
Reports - Descriptive	14
Speeches/Meeting Papers	12
Dissertations/Theses -…	5
Information Analyses	5
Numerical/Quantitative Data	5
Opinion Papers	5
Books	4
Guides - Classroom - Teacher	4
Tests/Questionnaires	3
Collected Works - General	2
Guides - Non-Classroom	2
Guides - Classroom - Learner	1
Guides - General	1
Reference Materials -…	1

Education Level

Higher Education	26
Postsecondary Education	23
Elementary Secondary Education	15
Elementary Education	12
Secondary Education	10
Middle Schools	8
High Schools	7
Intermediate Grades	5
Grade 4	3
Grade 5	3
Grade 6	3
Primary Education	3
Early Childhood Education	2
Grade 7	2
Junior High Schools	2
Kindergarten	2
Grade 1	1
Grade 2	1
Grade 3	1
Grade 8	1
Grade 9	1
Two Year Colleges	1

Audience

Teachers	6
Practitioners	4
Policymakers	1
Researchers	1
Students	1

Location

Australia	4
New York	4
United Kingdom (England)	4
Vermont	4
California	2
Connecticut	2
Japan	2
New Hampshire	2
New Mexico	2
Rhode Island	2
Turkey	2
District of Columbia	1
Germany	1
Hungary	1
Iran	1
Japan (Tokyo)	1
Nebraska	1
Netherlands	1
New Zealand	1
Pennsylvania	1
Singapore	1
Spain	1
Texas	1
United Kingdom (Scotland)	1
United States	1

Laws, Policies, & Programs

Individuals with Disabilities…	3
Every Student Succeeds Act…	2
No Child Left Behind Act 2001	1
Race to the Top	1

Assessments and Surveys

National Assessment of…	3
Test of English as a Foreign…	3
New York State Regents…	2
Woodcock Johnson Tests of…	2
Cornell Critical Thinking Test	1
Graduate Management Admission…	1
International English…	1
Kaufman Test of Educational…	1
Keymath Diagnostic Arithmetic…	1
Michigan Test of English…	1
Test of English for…	1
Woodcock Munoz Language Survey	1

Showing 1 to 15 of 116 results

Inter-Rater Reliability in Comprehensive Examination Scoring: The Case for Consistent and Collaborative Rater Training and Calibration

Saenz, David Arron – Online Submission, 2023

There is a vast body of literature documenting the positive impacts that rater training and calibration sessions have on inter-rater reliability; research indicates that several factors, including frequency and timing, play crucial roles in ensuring inter-rater reliability. Additionally, an increasing amount of research indicates possible links in…

Descriptors: Interrater Reliability, Scoring, Training, Scoring Rubrics

Computational Concepts and Their Assessment in Preschool Students: An Empirical Study

Peer reviewed

Marcos Jiménez; María Zapata-Cáceres; Marcos Román-González; Gregorio Robles; Jesús Moreno-León; Estefanía Martín-Barroso – Journal of Science Education and Technology, 2024

Computational thinking (CT) is a multidimensional term that encompasses a wide variety of problem-solving skills related to the field of computer science. Unfortunately, standardized, valid, and reliable methods to assess CT skills in preschool children are lacking, compromising the reliability of the results reported in CT interventions. To…

Descriptors: Computation, Thinking Skills, Student Evaluation, Preschool Children

Selecting Technically Adequate Tests

Peer reviewed

Susan K. Johnsen – Gifted Child Today, 2024

The author provides a checklist for educators who are selecting technically adequate tests for identifying and referring students for gifted education services and programs. The checklist includes questions related to how the test was normed, reliability and validity studies as well as questions related to types of scores, administration, and…

Descriptors: Test Selection, Academically Gifted, Gifted Education, Test Validity

Exploring an Alternative to Record Motor Competence Assessment: Interrater and Intrarater Audio-Video Reliability

Peer reviewed

Cristina Menescardi; Aida Carballo-Fazanes; Núria Ortega-Benavent; Isaac Estevan – Journal of Motor Learning and Development, 2024

The Canadian Agility and Movement Skill Assessment (CAMSA) is a valid and reliable circuit-based test of motor competence which can be used to assess children's skills in a live or recorded performance and then coded. We aimed to analyze the intrarater reliability of the CAMSA scores (total, time, and skill score) and time measured, by comparing…

Descriptors: Interrater Reliability, Evaluators, Scoring, Psychomotor Skills

Rater Connections and the Detection of Bias in Performance Assessment

Peer reviewed

Wind, Stefanie A. – Measurement: Interdisciplinary Research and Perspectives, 2022

In many performance assessments, one or two raters from the complete rater pool score each performance, resulting in a sparse rating design, where there are limited observations of each rater relative to the complete sample of students. Although sparse rating designs can be constructed to facilitate estimation of student achievement, the…

Descriptors: Evaluators, Bias, Identification, Performance Based Assessment

Evidence for Validity and Reliability of a Research-Based Assessment Instrument on Measurement Uncertainty

Peer reviewed

Gayle Geschwind; Michael Vignal; Marcos D. Caballero; H. J. Lewandowski – Physical Review Physics Education Research, 2024

The Survey of Physics Reasoning on Uncertainty Concepts in Experiments (SPRUCE) was designed to measure students' proficiency with measurement uncertainty concepts and practices across ten different assessment objectives to help facilitate the improvement of laboratory instruction focused on this important topic. To ensure the reliability and…

Descriptors: Measurement, Ambiguity (Context), Scientific Concepts, Physics

Automated Essay Scoring and the Deep Learning Black Box: How Are Rubric Scores Determined?

Peer reviewed

Kumar, Vivekanandan S.; Boulanger, David – International Journal of Artificial Intelligence in Education, 2021

This article investigates the feasibility of using automated scoring methods to evaluate the quality of student-written essays. In 2012, Kaggle hosted an Automated Student Assessment Prize contest to find effective solutions to automated testing and grading. This article: a) analyzes the datasets from the contest -- which contained hand-graded…

Descriptors: Automation, Scoring, Essays, Writing Evaluation

Comparison of Computer Scoring Model Performance for Short Text Responses across Undergraduate Institutional Types

Peer reviewed

Shiroda, Megan; Uhl, Juli D.; Urban-Lurain, Mark; Haudek, Kevin C. – Journal of Science Education and Technology, 2022

Constructed response (CR) assessments allow students to demonstrate understanding of complex topics and provide teachers with deeper insight into student thinking. Computer scoring models (CSMs) remove the barrier of increased time and effort, making CR more accessible. As CSMs are commonly created using responses from research-intensive colleges…

Descriptors: Responses, Student Evaluation, Scoring, Models

Development and Use of a Rubric to Assess Undergraduates' Problem Solutions in Physics

Peer reviewed

Kocakulah, Aysel – Participatory Educational Research, 2022

The aim of this study is to develop and apply a rubric to evaluate the solutions that second-year university pre-service teachers propose for questions about electromagnetic induction. In this study, which has a pretest-posttest quasi-experimental design with a control group, teaching of the topic of electromagnetic induction was applied to…

Descriptors: Scoring Rubrics, Student Evaluation, Undergraduate Students, Problem Solving

Partial Credit in Answer-Until-Correct Multiple-Choice Tests Deployed in a Classroom Setting

Peer reviewed

Slepkov, Aaron D.; Godfrey, Alan T. K. – Applied Measurement in Education, 2019

The answer-until-correct (AUC) method of multiple-choice (MC) testing involves test respondents making selections until the keyed answer is identified. Despite attendant benefits that include improved learning, broad student adoption, and facile administration of partial credit, the use of AUC methods for classroom testing has been extremely…

Descriptors: Multiple Choice Tests, Test Items, Test Reliability, Scores

Investigation of Rater Tendencies and Reliability in Different Assessment Methods with Many Facet Rasch Model

Peer reviewed

Koçak, Duygu – International Electronic Journal of Elementary Education, 2020

One of the most commonly used methods for measuring higher-order thinking skills such as problem-solving or written expression is open-ended items. Three main approaches are used to evaluate responses to open-ended items: general evaluation, rating scales, and rubrics. In order to measure and improve problem-solving skills of students, firstly, an…

Descriptors: Interrater Reliability, Item Response Theory, Test Items, Rating Scales

Applying Generalizability Theory in Language Testing: Comparing Nested and Crossed Scoring Designs in the Assessment of Speaking Skills

Peer reviewed

Polat, Murat; Turhan, Nihan Sölpük – International Journal of Curriculum and Instruction, 2021

Scoring language learners' speaking skills is open to a number of measurement errors, since raters' personal judgements could be involved in the process. Different grading designs, in which raters score a student's whole speaking skills or a specific dimension of the speaking performance, could be set up to control and minimize the amount of the error…

Descriptors: Language Tests, Scoring, Speech Communication, State Universities

Assessment Literacy for Educators in a Hurry

Popham, W. James – ASCD, 2018

What is assessment literacy? It is a handful of fundamental understandings about the testing concepts and procedures that influence educational decisions. And it just might be the most cost-effective means of real school improvement. With characteristic humor and aplomb, assessment expert W. James Popham strips away the psychometrician-speak and…

Descriptors: Student Evaluation, Educational Testing, Test Validity, Test Reliability

ITC Guidelines for the Large-Scale Assessment of Linguistically and Culturally Diverse Populations

Peer reviewed

International Journal of Testing, 2019

These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…

Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage

Applying Kane's Validity Framework to a Simulation Based Assessment of Clinical Competence

Peer reviewed

Tavares, Walter; Brydges, Ryan; Myre, Paul; Prpic, Jason; Turner, Linda; Yelle, Richard; Huiskamp, Maud – Advances in Health Sciences Education, 2018

Assessment of clinical competence is complex and inference based. Trustworthy and defensible assessment processes must have favourable evidence of validity, particularly where decisions are considered high stakes. We aimed to organize, collect and interpret validity evidence for a high stakes simulation based assessment strategy for certifying…

Descriptors: Competence, Simulation, Allied Health Personnel, Certification

Source

ProQuest LLC	5
Journal of Technology,…	4
Applied Measurement in…	3
Assessing Writing	3
Advances in Health Sciences…	2
Educational Assessment	2
Journal of Science Education…	2
Measurement:…	2
National Center for Research…	2
New Mexico Public Education…	2
Online Submission	2
ASCD	1
Assessment & Evaluation in…	1
Assessment in Education:…	1
Assessment in Higher Education	1
College Board Review	1
Council of Chief State School…	1
Deafness and Education…	1
Early Education and…	1
Education and Training in…	1
Educational Horizons	1
Educational Measurement:…	1
Educational Research	1
Electronic Journal of Science…	1
English Teaching Forum	1

Author

Darling-Hammond, Linda	2
Gearhart, Maryl	2
Koretz, Daniel	2
Osmundson, Ellen	2
Shavelson, Richard J.	2
Ahmadi, Alireza	1
Aida Carballo-Fazanes	1
Alverez de Santizo, Myrna…	1
Amini, Mojtaba	1
Angoff, William H.	1
Anthony, Jason L.	1
Assel, Michael M.	1
Attali, Yigal	1
Ault, Marilyn	1
Baker, Eva L.	1
Barrett, Thomas J.	1
Bell, Daniel	1
Bell, John	1
Ben Seipel	1
Bird, Tom	1
Blackorby, Jose	1
Blok, H.	1
Bolaños, Daniel	1
Borich, Gary D.	1