Kelly, Anthony – Assessment & Evaluation in Higher Education, 2023
The Research Excellence Framework is a high-stakes exercise used by the UK government to allocate billions of pounds of quality-related research (QR) funding and used by the media to rank universities and their departments in national league tables. The 2008, 2014 and 2021 assessments were zero-sum games in terms of league table position because…
Descriptors: Foreign Countries, Educational Assessment, Educational Research, Educational Quality
Daphna Harel; Dorothy Seaman; Jennifer Hill; Elisabeth King; Dana Burde – International Journal of Social Research Methodology, 2023
Indirect questioning attempts to overcome social desirability bias in survey research. However, to properly analyze the resulting data, it is crucial to understand how it impacts responses. This study analyzes results from a randomized experiment that tests whether direct versus indirect questioning methods lead to different results in a sample of…
Descriptors: Foreign Countries, Youth, Questioning Techniques, Language Usage
Saenz, David Arron – Online Submission, 2023
There is a vast body of literature documenting the positive impacts that rater training and calibration sessions have on inter-rater reliability, as research indicates that several factors, including frequency and timing, play crucial roles in ensuring inter-rater reliability. Additionally, an increasing amount of research indicates possible links in…
Descriptors: Interrater Reliability, Scoring, Training, Scoring Rubrics
Reagan Mozer; Luke Miratrix – Society for Research on Educational Effectiveness, 2023
Background: For randomized trials that use text as an outcome, traditional approaches for assessing treatment impact require each document first be manually coded for constructs of interest by trained human raters. These hand-coded scores are then used as a measured outcome for an impact analysis, with the average scores of the treatment group…
Descriptors: Artificial Intelligence, Coding, Randomized Controlled Trials, Research Methodology
Office of Educational Technology, US Department of Education, 2023
The U.S. Department of Education (Department) is committed to supporting the use of technology to improve teaching and learning and to support innovation throughout educational systems. This report addresses the clear need for sharing knowledge and developing policies for "Artificial Intelligence," a rapidly advancing class of…
Descriptors: Artificial Intelligence, Educational Technology, Technology Uses in Education, Educational Policy
Casabianca, Jodi M.; Donoghue, John R.; Shin, Hyo Jeong; Chao, Szu-Fu; Choi, Ikkyu – Journal of Educational Measurement, 2023
Using item-response theory to model rater effects provides an alternative solution for rater monitoring and diagnosis, compared to using standard performance metrics. In order to fit such models, the ratings data must be sufficiently connected in order to estimate rater effects. Due to popular rating designs used in large-scale testing scenarios,…
Descriptors: Item Response Theory, Alternative Assessment, Evaluators, Research Problems
Laura K. Allen; Arthur C. Graesser; Danielle S. McNamara – Grantee Submission, 2023
Assessments of natural language can provide vast information about individuals' thoughts and cognitive processes, but they often rely on time-intensive human scoring, deterring researchers from collecting these sources of data. Natural language processing (NLP) gives researchers the opportunity to implement automated textual analyses across a…
Descriptors: Psychological Studies, Natural Language Processing, Automation, Research Methodology
Chenchen Ma; Jing Ouyang; Chun Wang; Gongjun Xu – Grantee Submission, 2024
Survey instruments and assessments are frequently used in many domains of social science. When the constructs that these assessments try to measure become multifaceted, multidimensional item response theory (MIRT) provides a unified framework and convenient statistical tool for item analysis, calibration, and scoring. However, the computational…
Descriptors: Algorithms, Item Response Theory, Scoring, Accuracy
Glazer, Nancy; Wolfe, Edward W. – Applied Measurement in Education, 2020
This introductory article describes how constructed response scoring is carried out, particularly the rater monitoring processes, and illustrates three potential designs for conducting rater monitoring in an operational scoring project. The introduction also presents a framework for interpreting research conducted by those who study the constructed…
Descriptors: Scoring, Test Format, Responses, Predictor Variables
Rodgers, Wendy J.; Morris-Mathews, Hannah; Romig, John Elwood; Bettini, Elizabeth – Review of Educational Research, 2022
Classroom observation research plays an important role in policy, practice, and scholarship for students with disabilities. When interpreting results of observation studies, it is important to consider the validity evidence provided by researchers and how that speaks to the intended use of those results. In this literature synthesis, we used…
Descriptors: Special Education, Validity, Classroom Research, Students with Disabilities
Curran, Patrick J.; Georgeson, A. R.; Bauer, Daniel J.; Hussong, Andrea M. – International Journal of Behavioral Development, 2021
Conducting valid and reliable empirical research in the prevention sciences is an inherently difficult and challenging task. Chief among the challenges is the need to obtain numerical scores of underlying theoretical constructs for use in subsequent analysis. This challenge is further exacerbated by the increasingly common need to consider multiple…
Descriptors: Psychometrics, Scoring, Prevention, Scores
Soland, James – Educational and Psychological Measurement, 2022
Considerable thought is often put into designing randomized control trials (RCTs). From power analyses and complex sampling designs implemented preintervention to nuanced quasi-experimental models used to estimate treatment effects postintervention, RCT design can be quite complicated. Yet when psychological constructs measured using survey scales…
Descriptors: Item Response Theory, Surveys, Scoring, Randomized Controlled Trials
Seedhouse, Paul; Satar, Müge – Classroom Discourse, 2023
The same L2 speaking performance may be analysed and evaluated in very different ways by different teachers or raters. We present a new, technology-assisted research design which opens up to investigation the trajectories of convergence and divergence between raters. We tracked and recorded what different raters noticed when, whilst grading a…
Descriptors: Language Tests, English (Second Language), Second Language Learning, Oral Language
Boztunç Öztürk, Nagihan; Sahin, Melek Gülsah; Ilhan, Mustafa – Turkish Journal of Education, 2019
The aim of this research was to analyze and compare analytic rubric and general impression scoring in peer assessment. A total of 66 university students participated in the study and six of them were chosen as peer raters on a voluntary basis. In the research, students were supposed to prepare a sample study within the scope of scientific research…
Descriptors: Foreign Countries, College Students, Student Evaluation, Peer Evaluation
Regional Educational Laboratory Mid-Atlantic, 2024
These are the appendixes for the report, "Strengthening the Pennsylvania School Climate Survey to Inform School Decisionmaking." This study analyzed Pennsylvania School Climate Survey data from students and staff in the 2021/22 school year to assess the validity and reliability of the elementary school student version of the survey;…
Descriptors: Educational Environment, Surveys, Decision Making, School Personnel