Publication Date
In 2025 | 2 |
Since 2024 | 15 |
Since 2021 (last 5 years) | 68 |
Since 2016 (last 10 years) | 171 |
Since 2006 (last 20 years) | 439 |
Descriptor
Source
Author
Publication Type
Education Level
Audience
Researchers | 28 |
Practitioners | 2 |
Policymakers | 1 |
Students | 1 |
Location
Turkey | 14 |
Canada | 10 |
United States | 10 |
California | 9 |
Netherlands | 9 |
Australia | 6 |
Germany | 6 |
South Korea | 6 |
Iowa | 5 |
Norway | 5 |
Turkey (Ankara) | 5 |
More ▼ |
Laws, Policies, & Programs
Individuals with Disabilities… | 2 |
No Child Left Behind Act 2001 | 1 |
Assessments and Surveys
What Works Clearinghouse Rating
Irby, Sarah M.; Floyd, Randy G. – Psychology in the Schools, 2017
This study examined the exchangeability of total scores (i.e., intelligent quotients [IQs]) from three brief intelligence tests. Tests were administered to 36 children with intellectual giftedness, scored live by one set of primary examiners and later scored by a secondary examiner. For each student, six IQs were calculated, and all 216 values…
Descriptors: Intelligence Tests, Gifted, Error of Measurement, Scores
Keller, Lena; Preckel, Franzis; Brunner, Martin – Journal of Educational Psychology, 2021
It is well-documented that academic achievement is associated with students' self-perceptions of their academic abilities, that is, their academic self-concepts. However, low-achieving students may apply self-protective strategies to maintain a favorable academic self-concept when evaluating their academic abilities. Consequently, the relation…
Descriptors: Correlation, Academic Achievement, High Achievement, Low Achievement
Zaidi, Nikki L.; Swoboda, Christopher M.; Kelcey, Benjamin M.; Manuel, R. Stephen – Advances in Health Sciences Education, 2017
The extant literature has largely ignored a potentially significant source of variance in multiple mini-interview (MMI) scores by "hiding" the variance attributable to the sample of attributes used on an evaluation form. This potential source of hidden variance can be defined as rating items, which typically comprise an MMI evaluation…
Descriptors: Interviews, Scores, Generalizability Theory, Monte Carlo Methods
Byram, Jessica N.; Seifert, Mark F.; Brooks, William S.; Fraser-Cotlin, Laura; Thorp, Laura E.; Williams, James M.; Wilson, Adam B. – Anatomical Sciences Education, 2017
With integrated curricula and multidisciplinary assessments becoming more prevalent in medical education, there is a continued need for educational research to explore the advantages, consequences, and challenges of integration practices. This retrospective analysis investigated the number of items needed to reliably assess anatomical knowledge in…
Descriptors: Anatomy, Science Tests, Test Items, Test Reliability
Uzun, N. Bilge; Aktas, Mehtap; Asiret, Semih; Yormaz, Seha – Asian Journal of Education and Training, 2018
The goal of this study is to determine the reliability of the performance points of dentistry students regarding communication skills and to examine the scoring reliability by generalizability theory in balanced random and fixed facet (mixed design) data, considering also the interactions of student, rater and duty. The study group of the research…
Descriptors: Foreign Countries, Generalizability Theory, Scores, Test Reliability
Matcha, Wannisa; Gasevic, Dragan; Uzir, Nora'ayu Ahmad; Jovanovic, Jelena; Pardo, Abelardo; Lim, Lisa; Maldonado-Mahauad, Jorge; Gentili, Sheridan; Perez-Sanagustin, Mar; Tsai, Yi-Shan – Journal of Learning Analytics, 2020
Generalizability of the value of methods based on learning analytics remains one of the big challenges in the field of learning analytics. One approach to testing generalizability of a method is to apply it consistently in different learning contexts. This study extends a previously published work by examining the generalizability of a learning…
Descriptors: Learning Analytics, Learning Strategies, Instructional Design, Delivery Systems
Martínez, José Felipe; Kloser, Matt; Srinivasan, Jayashri; Stecher, Brian; Edelman, Amanda – Educational and Psychological Measurement, 2022
Adoption of new instructional standards in science demands high-quality information about classroom practice. Teacher portfolios can be used to assess instructional practice and support teacher self-reflection anchored in authentic evidence from classrooms. This study investigated a new type of electronic portfolio tool that allows efficient…
Descriptors: Science Instruction, Academic Standards, Instructional Innovation, Electronic Publishing
Harrison, George M. – Journal of Educational Measurement, 2015
The credibility of standard-setting cut scores depends in part on two sources of consistency evidence: intrajudge and interjudge consistency. Although intrajudge consistency feedback has often been provided to Angoff judges in practice, more evidence is needed to determine whether it achieves its intended effect. In this randomized experiment with…
Descriptors: Interrater Reliability, Standard Setting (Scoring), Cutting Scores, Feedback (Response)
Clauser, Jerome C.; Margolis, Melissa J.; Clauser, Brian E. – Journal of Educational Measurement, 2014
Evidence of stable standard setting results over panels or occasions is an important part of the validity argument for an established cut score. Unfortunately, due to the high cost of convening multiple panels of content experts, standards often are based on the recommendation from a single panel of judges. This approach implicitly assumes that…
Descriptors: Standard Setting (Scoring), Generalizability Theory, Replication (Evaluation), Cutting Scores
Volpe, Robert J.; Briesch, Amy M. – School Psychology Review, 2016
This study examines the dependability of two scaling approaches for using a five-item Direct Behavior Rating multi-item scale to assess student disruptive behavior. A series of generalizability theory studies were used to compare a traditional frequency-based scaling approach with an approach wherein the informant compares a target student's…
Descriptors: Scaling, Behavior Rating Scales, Behavior Problems, Student Behavior
Gaertner, Holger; Brunner, Martin – Educational Assessment, Evaluation and Accountability, 2018
In many countries, students are asked about their perceptions of teaching in order to make decisions about the further development of teaching practices on the basis of this feedback. The stability of this measurement of teaching quality is a prerequisite for the ability to generalize the results to other teaching situations. The present study…
Descriptors: Student Attitudes, Educational Attitudes, Teacher Effectiveness, Student Evaluation of Teacher Performance
Wickerd, Garry; Hulac, David – Journal of Applied School Psychology, 2017
Accurate and rapid identification of students displaying behavioral problems requires instrumentation that is user friendly and reliable. The purpose of the study was to evaluate a multi-item direct behavior rating scale called the Direct Behavior Rating-Multiple Item Scale (DBR-MIS) for disruptive behavior to determine the number of…
Descriptors: Behavior Rating Scales, Kindergarten, Behavior Problems, Young Children
McLaughlin, Tara W.; Snyder, Patricia A.; Algina, James – Grantee Submission, 2017
The Learning Target Rating Scale (LTRS) is a measure designed to evaluate the quality of teacher-developed learning targets for embedded instruction for early learning. In the present study, we examined the measurement dependability of LTRS scores by conducting a generalizability study (G-study). We used a partially nested, three-facet model to…
Descriptors: Generalizability Theory, Scores, Rating Scales, Evaluation Methods
DeMars, Christine – Applied Measurement in Education, 2015
In generalizability theory studies in large-scale testing contexts, sometimes a facet is very sparsely crossed with the object of measurement. For example, when assessments are scored by human raters, it may not be practical to have every rater score all students. Sometimes the scoring is systematically designed such that the raters are…
Descriptors: Educational Assessment, Measurement, Data, Generalizability Theory
Jensen, Bryant; Grajeda, Sara; Haertel, Edward – Educational Assessment, 2018
We trace the development and analyze the generalizability of the Classroom Assessment of Sociocultural Interactions (CASI), an observation system designed to measure cultural dimensions of classroom interactions. We establish CASI measurement properties by analyzing panoramic videos of 4th and 5th grade classrooms from the Measures of Effective…
Descriptors: Classroom Observation Techniques, Grade 4, Grade 5, Error of Measurement