Tate, Kevin A.; Rivera, Edil Torres; Conwill, William L.; Miller, M. David; Puig, Ana – Journal for Specialists in Group Work, 2013
There is a clear call in group counseling practice and training for evidence-based practice (ACA, 2005; ASGW, 2008; CACREP, 2009). At the same time, group counselors also are asked to keep clients' experience at the center of their work (ASGW, 2012). This article outlines the authors' effort to develop and study an instrument designed to measure…
Descriptors: Evidence, Group Dynamics, Construct Validity, Group Counseling
Lin, Huifen – Research-publishing.net, 2012
For the past few decades, instructional materials enriched with multimedia elements have enjoyed increasing popularity. Multimedia-based instruction incorporating stimulating visuals, authentic audios, and interactive animated graphs of different kinds all provide additional and valuable opportunities for students to learn beyond what conventional…
Descriptors: Multimedia Materials, Multimedia Instruction, Vocabulary Development, Reading Comprehension
Bunce, Diane M.; VandenPlas, Jessica R.; Neiles, Kelly Y.; Flens, Elizabeth A. – Journal of College Science Teaching, 2010
Development of a research instrument to measure student achievement requires planning and reliability and validity testing before the instrument is used to collect data. These steps are often overlooked in research studies, but when the instrument is to be used across a wider population, the inclusion of these steps is vital to address the…
Descriptors: Academic Achievement, Measures (Individuals), Science Process Skills, Test Reliability
Raymond, Mark R.; Clauser, Brian E.; Furman, Gail E. – Advances in Health Sciences Education, 2010
The use of standardized patients to assess communication skills is now an essential part of assessing a physician's readiness for practice. To improve the reliability of communication scores, it has become increasingly common in recent years to use statistical models to adjust ratings provided by standardized patients. This study employed ordinary…
Descriptors: Generalizability Theory, Physicians, Patients, Least Squares Statistics
Kammeyer-Mueller, John; Steel, Piers D. G.; Rubenstein, Alex – Multivariate Behavioral Research, 2010
Common source bias has been the focus of much attention. To minimize the problem, researchers have sometimes been advised to take measurements of predictors from one observer and measurements of outcomes from another observer or to use separate occasions of measurement. We propose that these efforts to eliminate biases due to common source…
Descriptors: Statistical Bias, Predictor Variables, Measurement, Data Collection
Boyd, Donald; Lankford, Hamilton; Loeb, Susanna; Wyckoff, James – Journal of Educational and Behavioral Statistics, 2013
Test-based accountability as well as value-added assessments and much experimental and quasi-experimental research in education rely on achievement tests to measure student skills and knowledge. Yet, we know little regarding fundamental properties of these tests, an important example being the extent of measurement error and its implications for…
Descriptors: Accountability, Educational Research, Educational Testing, Error of Measurement
Johnson, Evelyn S.; Semmelroth, Carrie L. – Journal of Special Education Apprenticeship, 2012
This paper reports the results of interrater agreement analyses on a pilot special education teacher evaluation instrument, the Recognizing Effective Special Education Teachers (RESET) Observation Tool (OT). Using evidence-based instructional practices as the basis for the evaluation, the RESET OT is designed for the spectrum of different…
Descriptors: Interrater Reliability, Pilot Projects, Special Education, Special Education Teachers
Mashburn, Andrew J.; Meyer, J. Patrick; Allen, Joseph P.; Pianta, Robert C. – Educational and Psychological Measurement, 2014
Observational methods are increasingly being used in classrooms to evaluate the quality of teaching. Operational procedures for observing teachers are somewhat arbitrary in existing measures and vary across different instruments. To study the effect of different observation procedures on score reliability and validity, we conducted an experimental…
Descriptors: Observation, Teacher Evaluation, Reliability, Validity
Jeon, Min-Jeong; Lee, Guemin; Hwang, Jeong-Won; Kang, Sang-Jin – Asia Pacific Education Review, 2009
The purpose of this study was to investigate the methods of estimating the reliability of school-level scores using generalizability theory and multilevel models. Two approaches, "student within schools" and "students within schools and subject areas," were conceptualized and implemented in this study. Four methods resulting from the combination…
Descriptors: Generalizability Theory, Scores, Reliability, Statistical Analysis
Ure, Abigail C. – ProQuest LLC, 2011
This study investigated how 2 different rating conditions, the controlled rating condition (CRC) and the uncontrolled rating condition (URC), affected rater behavior and the reliability of a performance assessment (PA) known as the Missionary Teaching Assessment (MTA). The CRC gives raters the capability to manipulate (pause, rewind, fast-forward)…
Descriptors: Teacher Evaluation, Performance Based Assessment, Performance Tests, Generalizability Theory
Lewis, Scott E.; Shaw, Janet L.; Freeman, Kathryn A. – Chemistry Education Research and Practice, 2011
Open-ended assessments, defined as assessments with a large set of possible correct answers, by nature lend themselves to concerns regarding accurate and consistent grading. This article describes one particular open-ended assessment, named Creative Exercises (CE), designed for promoting students' interconnection of concepts in a college general…
Descriptors: Evidence, Concept Mapping, Knowledge Level, Chemistry
Anderson, Daniel; Alonzo, Julie; Tindal, Gerald – Behavioral Research and Teaching, 2012
In this technical report, we describe the results of a study of mathematics items written to align with the Common Core State Standards (CCSS) in grades 6-8. In each grade, CCSS items were organized into forms, and the reliability of these forms was evaluated along with an experimental form including items aligned with the National Council of…
Descriptors: Curriculum Based Assessment, Mathematics Tests, Academic Standards, State Standards
Lakin, Joni M.; Lai, Emily R. – Educational and Psychological Measurement, 2012
For educators seeking to differentiate instruction, cognitive ability tests sampling multiple content domains, including verbal, quantitative, and nonverbal reasoning, provide superior information about student strengths and weaknesses compared with unidimensional reasoning measures. However, these ability tests have not been fully evaluated with…
Descriptors: Aptitude Tests, Nonverbal Ability, Cognitive Ability, Verbal Ability
Christ, Theodore J.; Riley-Tillman, T. Chris; Chafouleas, Sandra M.; Boice, Christina H. – Educational and Psychological Measurement, 2010
Generalizability theory was used to examine the generalizability and dependability of outcomes from two single-item Direct Behavior Rating (DBR) scales: DBR of actively manipulating and DBR of visually distracted. DBR is a behavioral assessment tool with specific instrumentation and procedures that can be used by a variety of service delivery…
Descriptors: Generalizability Theory, Student Behavior, Data Collection, Student Evaluation
Huang, Jinyan – TESOL Journal, 2011
Using generalizability theory, this study examined both the rating variability and reliability of English as a second language (ESL) students' writing in two provincial examinations in Canada. This article discusses expected and unexpected similarities and differences related to rating variability and reliability between the two testing programs.…
Descriptors: Foreign Countries, Generalizability Theory, Test Reliability, Testing Programs