Publication Date
In 2025 | 2 |
Since 2024 | 33 |
Since 2021 (last 5 years) | 134 |
Since 2016 (last 10 years) | 455 |
Since 2006 (last 20 years) | 1164 |
Descriptor
Comparative Analysis | 1930 |
Reliability | 873 |
Test Reliability | 787 |
Foreign Countries | 547 |
Test Validity | 442 |
Correlation | 345 |
Validity | 330 |
Interrater Reliability | 325 |
Statistical Analysis | 321 |
Scores | 274 |
Measures (Individuals) | 236 |
Author
Reckase, Mark D. | 6 |
Attali, Yigal | 5 |
Coniam, David | 5 |
Brennan, Robert L. | 4 |
Crehan, Kevin D. | 4 |
Feldt, Leonard S. | 4 |
Hakstian, A. Ralph | 4 |
Jones, Ian | 4 |
Kolen, Michael J. | 4 |
Lunz, Mary E. | 4 |
August, Diane | 3 |
Audience
Researchers | 35 |
Practitioners | 29 |
Teachers | 15 |
Administrators | 9 |
Policymakers | 6 |
Counselors | 2 |
Media Staff | 2 |
Parents | 1 |
Support Staff | 1 |
Location
Turkey | 59 |
United States | 47 |
Australia | 36 |
Canada | 32 |
United Kingdom (England) | 32 |
China | 31 |
United Kingdom | 28 |
Germany | 25 |
Netherlands | 24 |
Taiwan | 22 |
Hong Kong | 20 |
What Works Clearinghouse Rating
Meets WWC Standards with or without Reservations | 1 |
Does not meet standards | 1 |
Andrew R. Thompson – Advances in Physiology Education, 2024
The revised two-factor Study Process Questionnaire and the Approaches and Study Skills Inventory for Students are two instruments commonly used to measure student learning approach. Although they are designed to measure similar constructs, it is unclear whether the metrics they provide differ in terms of their real-world classification of learning…
Descriptors: Comparative Analysis, Anatomy, Classification, Cognitive Style
Fu, Yuanshu; Wen, Zhonglin; Wang, Yang – Educational and Psychological Measurement, 2022
Composite reliability, or coefficient omega, can be estimated using structural equation modeling. Composite reliability is usually estimated under the basic independent clusters model of confirmatory factor analysis (ICM-CFA). However, due to the existence of cross-loadings, the model fit of the exploratory structural equation model (ESEM) is…
Descriptors: Comparative Analysis, Structural Equation Models, Factor Analysis, Reliability
Brian Weiler; Ling-Yu Guo – Language, Speech, and Hearing Services in Schools, 2024
Purpose: The finite verb morphology composite (FVMC) is a valid measure for charting children's tense development and for differentiating children with and without language impairment during preschool and early elementary years. However, it is unclear whether FVMC scores vary as a function of language sample elicitation contexts. The current study…
Descriptors: Verbs, Preschool Children, Morphology (Languages), Accuracy
Crompvoets, Elise A. V.; Béguin, Anton A.; Sijtsma, Klaas – Journal of Educational and Behavioral Statistics, 2020
Pairwise comparison is becoming increasingly popular as a holistic measurement method in education. Unfortunately, many comparisons are required for reliable measurement. To reduce the number of required comparisons, we developed an adaptive selection algorithm (ASA) that selects the most informative comparisons while taking the uncertainty of the…
Descriptors: Comparative Analysis, Statistical Analysis, Mathematics, Measurement
Grajzel, Katalin; Dumas, Denis; Acar, Selcuk – Journal of Creative Behavior, 2022
One of the best-known and most frequently used measures of creative idea generation is the Torrance Test of Creative Thinking (TTCT). The TTCT Verbal, assessing verbal ideation, contains two forms created to be used interchangeably by researchers and practitioners. However, the parallel forms reliability of the two versions of the TTCT Verbal has…
Descriptors: Test Reliability, Creative Thinking, Creativity Tests, Verbal Ability
Lind, Veronika; Svensson, Melanie; Harringe, Marita L. – Measurement in Physical Education and Exercise Science, 2022
Goniometry is commonly used to evaluate joint range of motion (ROM). The most widespread method, a manual universal goniometer (UG), is considered time-consuming and difficult to handle. The digital goniometer EasyAngle (EA) was developed to improve and simplify the evaluation of ROM. This study aimed to evaluate the reliability and validity of EA…
Descriptors: Motor Reactions, Measurement Techniques, Comparative Analysis, Measurement Equipment
Damian, Elena; Meuleman, Bart; van Oorschot, Wim – Sociological Methods & Research, 2022
In this article, we examine whether cross-national studies disclose enough information for independent researchers to evaluate the validity and reliability of the findings (evaluation transparency) or to perform a direct replication (replicability transparency). The first contribution is theoretical. We develop a heuristic theoretical model…
Descriptors: National Surveys, Cross Cultural Studies, Social Science Research, Periodicals
Wind, Stefanie A. – Measurement: Interdisciplinary Research and Perspectives, 2022
In many performance assessments, one or two raters from the complete rater pool score each performance, resulting in a sparse rating design, where there are limited observations of each rater relative to the complete sample of students. Although sparse rating designs can be constructed to facilitate estimation of student achievement, the…
Descriptors: Evaluators, Bias, Identification, Performance Based Assessment
Jayden J. Lee – ProQuest LLC, 2022
The functional neuroanatomy of language localization in dyslexia has primarily been studied in the context of reading. However, dyslexia is sometimes referred to as a "language-based learning disability," yet the functional signature of the core language comprehension network in dyslexia is far less understood. This thesis presents a…
Descriptors: Dyslexia, Brain Hemisphere Functions, Comparative Analysis, Speech Communication
Fatih Yavuz; Özgür Çelik; Gamze Yavas Çelik – British Journal of Educational Technology, 2025
This study investigates the validity and reliability of generative large language models (LLMs), specifically ChatGPT and Google's Bard, in grading student essays in higher education based on an analytical grading rubric. A total of 15 experienced English as a foreign language (EFL) instructors and two LLMs were asked to evaluate three student…
Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Computational Linguistics
Hunter, Seth B. – Journal of Education Human Resources, 2023
Teacher performance scores inform education leaders' management of teacher human resources. However, prior research has implied that different interpretations of performance criteria between teachers and their evaluators suppress teacher development. Although research has examined teacher perceptions of performance scores and compared teacher…
Descriptors: Teacher Evaluation, Teacher Effectiveness, Self Evaluation (Individuals), Interrater Reliability
Alain Bengochea; Sabrina F. Sembiante – Review of Education, 2024
This best-evidence synthesis appraises the design and outcome characteristics of vocabulary intervention studies conducted with preschool through 6th grade emergent bilingual (EB) children and spotlights rigorously designed studies for which effects could be better attributed to instructional features. Twenty-nine selected studies were analysed…
Descriptors: Bilingualism, Vocabulary Development, Intervention, Comparative Analysis
Saluja, Ronak; Cheng, Sierra; delos Santos, Keemo Althea; Chan, Kelvin K. W. – Research Synthesis Methods, 2019
Objective: Various statistical methods have been developed to estimate hazard ratios (HRs) from published Kaplan-Meier (KM) curves for the purpose of performing meta-analyses. The objective of this study was to determine the reliability, accuracy, and precision of four commonly used methods by Guyot, Williamson, Parmar, and Hoyle and Henley.…
Descriptors: Meta Analysis, Reliability, Accuracy, Randomized Controlled Trials
Gill, Tim – Research Matters, 2022
In Comparative Judgement (CJ) exercises, examiners are asked to look at a selection of candidate scripts (with marks removed) and order them in terms of which they believe display the best quality. By including scripts from different examination sessions, the results of these exercises can be used to help with maintaining standards. Results from…
Descriptors: Comparative Analysis, Decision Making, Scripts, Standards
Yun Long; Haifeng Luo; Yu Zhang – npj Science of Learning, 2024
This study explores the use of Large Language Models (LLMs), specifically GPT-4, in analysing classroom dialogue--a key task for teaching diagnosis and quality improvement. Traditional qualitative methods are both knowledge- and labour-intensive. This research investigates the potential of LLMs to streamline and enhance this process. Using…
Descriptors: Classroom Communication, Computational Linguistics, Chinese, Mathematics Instruction