Showing 1 to 15 of 25 results
Peer reviewed
John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring constructed-response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product-multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Kelvin Terrell Pompey – ProQuest LLC, 2021
Many methods are used to measure interrater reliability for studies where each target receives ratings by a different set of judges. The purpose of this study is to explore the use of hierarchical modeling for estimating interrater reliability using the intraclass correlation coefficient. This study provides a description of how the ICC can be…
Descriptors: Interrater Reliability, Evaluation Methods, Test Reliability, Correlation
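The entry above centers on the intraclass correlation coefficient (ICC) as a measure of interrater reliability. As a minimal sketch only — assuming a complete targets-by-raters matrix and the one-way random-effects ICC(1), not the hierarchical-modeling approach the study explores — the coefficient can be computed from one-way ANOVA mean squares:

```python
import numpy as np

def icc1(ratings):
    """One-way random-effects ICC(1) for an n-targets x k-raters matrix.

    ICC(1) = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are
    the between-target and within-target mean squares from one-way ANOVA.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand_mean = ratings.mean()
    row_means = ratings.mean(axis=1)
    # Between-target mean square: variability of target means around the grand mean.
    msb = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)
    # Within-target mean square: rater disagreement within each target.
    msw = ((ratings - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)
```

Note this sketch assumes every target is rated by the same set of judges; the study's motivating case — each target rated by a different set of judges — is exactly where this simple crossed-design formula breaks down and hierarchical modeling becomes attractive.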
Peer reviewed
Solano-Flores, Guillermo – Educational Measurement: Issues and Practice, 2021
This article proposes a Boolean approach to representing and analyzing interobserver agreement in dichotomous coding. Building on the notion that observations are samples of a universe of observations, it submits that coding can be viewed as a process in which observers sample pieces of evidence on constructs. It distinguishes between formal and…
Descriptors: Online Searching, Coding, Interrater Reliability, Evidence
Peer reviewed
PDF on ERIC
Donoghue, John R.; McClellan, Catherine A.; Hess, Melinda R. – ETS Research Report Series, 2022
When constructed-response items are administered for a second time, it is necessary to evaluate whether the current Time B administration's raters have drifted from the scoring of the original administration at Time A. To study this, Time A papers are sampled and rescored by Time B scorers. Commonly the scores are compared using the proportion of…
Descriptors: Item Response Theory, Test Construction, Scoring, Testing
Peer reviewed
PDF on ERIC
Moeller, Julia; Viljaranta, Jaana; Kracke, Bärbel; Dietrich, Julia – Frontline Learning Research, 2020
This article proposes a study design developed to disentangle the objective characteristics of a learning situation from individuals' subjective perceptions of that situation. The term objective characteristics refers to the agreement across students, whereas subjective perceptions refers to inter-individual heterogeneity. We describe a novel…
Descriptors: Student Attitudes, College Students, Lecture Method, Student Interests
Peer reviewed
Johnson, Austin H.; Chafouleas, Sandra M.; Briesch, Amy M. – School Psychology Quarterly, 2017
In this study, generalizability theory was used to examine the extent to which (a) time-sampling methodology, (b) number of simultaneous behavior targets, and (c) individual raters influenced variance in ratings of academic engagement for an elementary-aged student. Ten graduate-student raters, with an average of 7.20 hr of previous training in…
Descriptors: Generalizability Theory, Sampling, Elementary School Students, Learner Engagement
Peer reviewed
Becraft, Jessica L.; Borrero, John C.; Davis, Barbara J.; Mendres-Smith, Amber E. – Education and Treatment of Children, 2016
The current study was designed to evaluate a rotating momentary time sampling (MTS) data collection system. A rotating MTS system has been used to measure activity preferences of preschoolers but not to collect data on responses that vary in duration and frequency (e.g., talking). We collected data on talking for 10 preschoolers using a 5-s MTS…
Descriptors: Sampling, Time, Interrater Reliability, Data Collection
Yvette R. Harris – Sage Research Methods Cases, 2016
The goal of this case study was to introduce students to ways to conduct research on parent-child cognitive learning interactions. To this end, the case study begins with an overview of the theoretical and empirical work supporting the development of my research program on parent-child cognitive learning interaction research and continues with a…
Descriptors: Student Research, Parent Child Relationship, Interaction, Sampling
Peer reviewed
Rapp, John T.; Carroll, Regina A.; Stangeland, Lindsay; Swanson, Greg; Higgins, William J. – Behavior Modification, 2011
The authors evaluated the extent to which interobserver agreement (IOA) scores, using the block-by-block method for events scored with continuous duration recording (CDR), were higher when the data from the same sessions were converted to discontinuous methods. Sessions with IOA scores of 89% or less with CDR were rescored using 10-s partial…
Descriptors: Intervals, Sampling, Comparative Analysis, Measures (Individuals)
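The abstract above compares interobserver agreement (IOA) under continuous duration recording with discontinuous methods such as 10-s partial-interval recording. As an illustrative sketch — the interval width, event representation, and interval-by-interval agreement index here are assumptions for demonstration, not the authors' block-by-block method — converting timed events to partial-interval records and scoring agreement might look like:

```python
import math

def partial_interval(events, session_len, width=10.0):
    """Partial-interval recording: mark an interval 1 if any event
    (onset, offset) in seconds overlaps it, else 0."""
    n = math.ceil(session_len / width)
    scored = [0] * n
    for onset, offset in events:
        first = int(onset // width)
        # Subtract a tiny epsilon so an offset on a boundary does not
        # spill into the next interval.
        last = min(n - 1, int(max(onset, offset - 1e-9) // width))
        for i in range(first, last + 1):
            scored[i] = 1
    return scored

def interval_ioa(a, b):
    """Interval-by-interval IOA: proportion of intervals on which
    two observers' records match."""
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

For example, over a 30-s session, one observer recording talking during (0, 5) and another recording (0, 5) and (25, 30) agree on two of three 10-s intervals, giving an IOA of about 0.67. Coarser intervals tend to inflate agreement, which is the kind of artifact the study examines.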
Peer reviewed
PDF on ERIC
Zhang, Mo – ETS Research Report Series, 2013
Many testing programs use automated scoring to grade essays. One issue in automated essay scoring that has not been examined adequately is population invariance and its causes. The primary purpose of this study was to investigate the impact of sampling in model calibration on population invariance of automated scores. This study analyzed scores…
Descriptors: Automation, Scoring, Essay Tests, Sampling
Peer reviewed
PDF on ERIC
Boller, Kimberly; Kisker, Ellen Eliason – Regional Educational Laboratory, 2014
This guide is designed to help researchers make sure that their research reports include enough information about study measures so that readers can assess the quality of the study's methods and results. The guide also provides examples of write-ups about measures and suggests resources for learning more about these topics. The guide assumes…
Descriptors: Research Reports, Research Methodology, Educational Research, Check Lists
Peer reviewed
Gugiu, Mihaiela R.; Gugiu, Paul C.; Baldus, Robert – Journal of MultiDisciplinary Evaluation, 2012
Background: Educational researchers have long espoused the virtues of writing with regard to student cognitive skills. However, research on the reliability of the grades assigned to written papers reveals a high degree of contradiction, with some researchers concluding that the grades assigned are very reliable whereas others suggesting that they…
Descriptors: Grades (Scholastic), Grading, Scoring Rubrics, Research Design
Peer reviewed
Soslau, Elizabeth; Lewis, Kandia – Action in Teacher Education, 2014
For accreditation and programmatic decision making, education school administrators use inter-rater reliability analyses to judge credibility of student-teacher assessments. Although weak levels of agreement between university-appointed supervisors and cooperating teachers are usually interpreted to indicate that the process is not being…
Descriptors: Interrater Reliability, Accreditation (Institutions), Student Teacher Evaluation, Focus Groups
Peer reviewed
Ong, Justina; Zhang, Lawrence Jun – TESOL Quarterly: A Journal for Teachers of English to Speakers of Other Languages and of Standard English as a Second Dialect, 2013
Little is known about the effects of various planning and revising conditions on composition quality in experimental or TESOL education research. This study examined the effects of planning conditions (planning, prolonged planning, free writing, and control), subplanning conditions (task-given, task-content-given, and…
Descriptors: English (Second Language), Second Language Learning, Cognitive Processes, Writing (Composition)
Peer reviewed
Gwet, Kilem Li – Psychometrika, 2008
Most inter-rater reliability studies using nominal scales suggest the existence of two populations of inference: the population of subjects (collection of objects or persons to be rated) and that of raters. Consequently, the sampling variance of the inter-rater reliability coefficient can be seen as a result of the combined effect of the sampling…
Descriptors: Interrater Reliability, Computation, Statistical Inference, Sampling
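The Gwet (2008) abstract concerns the sampling variance of inter-rater reliability coefficients under joint sampling of subjects and raters. As a related sketch — showing Gwet's own AC1 agreement statistic from his broader line of work, which is not necessarily the specific coefficient whose variance this paper derives — the two-rater, dichotomous-scale case is:

```python
def gwet_ac1(r1, r2):
    """Gwet's AC1 chance-corrected agreement for two raters using
    dichotomous codes (0/1).

    AC1 = (pa - pe) / (1 - pe), with chance agreement
    pe = 2 * q * (1 - q), where q is the mean prevalence of code 1
    across the two raters.
    """
    n = len(r1)
    pa = sum(a == b for a, b in zip(r1, r2)) / n  # observed agreement
    q = (sum(r1) + sum(r2)) / (2 * n)             # mean prevalence of code 1
    pe = 2 * q * (1 - q)                          # chance agreement under AC1
    return (pa - pe) / (1 - pe)
```

Unlike Cohen's kappa, AC1 behaves stably when one code is very rare, which is one reason it is often preferred for skewed dichotomous coding tasks.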