Publication Date
In 2025 | 2 |
Since 2024 | 33 |
Since 2021 (last 5 years) | 134 |
Since 2016 (last 10 years) | 455 |
Since 2006 (last 20 years) | 1164 |
Descriptor
Comparative Analysis | 1930 |
Reliability | 873 |
Test Reliability | 787 |
Foreign Countries | 547 |
Test Validity | 442 |
Correlation | 345 |
Validity | 330 |
Interrater Reliability | 325 |
Statistical Analysis | 321 |
Scores | 274 |
Measures (Individuals) | 236 |
More ▼ |
Source
Author
Reckase, Mark D. | 6 |
Attali, Yigal | 5 |
Coniam, David | 5 |
Brennan, Robert L. | 4 |
Crehan, Kevin D. | 4 |
Feldt, Leonard S. | 4 |
Hakstian, A. Ralph | 4 |
Jones, Ian | 4 |
Kolen, Michael J. | 4 |
Lunz, Mary E. | 4 |
August, Diane | 3 |
More ▼ |
Publication Type
Education Level
Audience
Researchers | 35 |
Practitioners | 29 |
Teachers | 15 |
Administrators | 9 |
Policymakers | 6 |
Counselors | 2 |
Media Staff | 2 |
Parents | 1 |
Support Staff | 1 |
Location
Turkey | 59 |
United States | 47 |
Australia | 36 |
Canada | 32 |
United Kingdom (England) | 32 |
China | 31 |
United Kingdom | 28 |
Germany | 25 |
Netherlands | 24 |
Taiwan | 22 |
Hong Kong | 20 |
More ▼ |
Laws, Policies, & Programs
Assessments and Surveys
What Works Clearinghouse Rating
Meets WWC Standards with or without Reservations | 1 |
Does not meet standards | 1 |
Ian Jones; Ben Davies – International Journal of Research & Method in Education, 2024
Educational researchers often need to construct precise and reliable measurement scales of complex and varied representations such as participants' written work, videoed lesson segments and policy documents. Developing such scales using can be resource-intensive and time-consuming, and the outcomes are not always reliable. Here we present…
Descriptors: Educational Research, Comparative Analysis, Educational Researchers, Measurement
Manjula Wijewickrema – portal: Libraries and the Academy, 2024
This research compares the performance measures reported by two bibliographic databases relevant to a set of authors who have published in predatory journals. The reliability of decision-making based on the information provided by uncontrolled bibliographic databases is examined to support rational decisions. A sample of authors who published in…
Descriptors: Periodicals, Ethics, Deception, Authors
Swapna Haresh Teckwani; Amanda Huee-Ping Wong; Nathasha Vihangi Luke; Ivan Cherh Chiet Low – Advances in Physiology Education, 2024
The advent of artificial intelligence (AI), particularly large language models (LLMs) like ChatGPT and Gemini, has significantly impacted the educational landscape, offering unique opportunities for learning and assessment. In the realm of written assessment grading, traditionally viewed as a laborious and subjective process, this study sought to…
Descriptors: Accuracy, Reliability, Computational Linguistics, Standards
Lucy Chambers; Emma Walland; Jo Ireland – Research Matters, 2024
Comparative Judgement (CJ) is traditionally and primarily used to compare written texts. In this study we explored whether we could extend its use to comparing audio files. We used GCSE Music portfolios which contained a mix of audio recordings, musical scores and text documents. Fifteen judges completed two exercises: one comparing musical…
Descriptors: Evaluative Thinking, Judges, Comparative Analysis, Reliability
Junjie, Ma; Yingxin, Ma – Online Submission, 2022
This paper aims to explore the philosophical theoretical foundations of two basic research paradigms, namely positivism and interpretivism. In the discussion process, literature in the relevant fields including academic papers and books is reviewed and used as support for the analysis. Firstly, the paper explores the differences between the…
Descriptors: Ideology, Bias, Credibility, Research Methodology
Kiri Mealings; Kelly Miles; Joerg M. Buchholz – International Journal of Listening, 2025
A child's ability to comprehend speech in the mainstream classroom is vital for intellectual and social development. However, listening conditions are often sub-optimal; the presence of multiple talkers, high noise levels, and long reverberation times add to the challenge of listening with a developing auditory system. An assessment that captures…
Descriptors: Elementary School Students, Listening Comprehension Tests, Comparative Analysis, Speech Communication
Lucy Chambers; Sylvia Vitello; Carmen Vidal Rodeiro – Assessment in Education: Principles, Policy & Practice, 2024
In England, some secondary-level qualifications comprise non-exam assessments which need to undergo moderation before grading. Currently, moderation is conducted at centre (school) level. This raises challenges for maintaining the standard across centres. Recent technological advances enable novel moderation methods that are no longer bound by…
Descriptors: Foreign Countries, Evaluation Methods, Comparative Analysis, Grading
Antonio P. Gutierrez de Blume; Diana Marcela Montoya Londoño; Virginia Jiménez Rodríguez; Olivia Morán Núñez; Ariel Cuadro; Lilián Daset; Mauricio Molina Delgado; Claudia García de la Cadena; María Beatríz Beltrán Navarro; Aníbal Puente Ferreras; Sebastián Urquijo; Walter Lizandro Arias – Metacognition and Learning, 2024
Metacognition is defined as a higher-order thinking skill that enables individuals to monitor, control, and regulate their thinking and behavior. In education, this skill is important, as learners need to self-regulate their learning behaviors for successful lifelong learning. Thus, it is essential for educators and learners alike to know their…
Descriptors: Metacognition, Measures (Individuals), Psychometrics, Standards
Yuan Tian; Xi Yang; Suhail A. Doi; Luis Furuya-Kanamori; Lifeng Lin; Joey S. W. Kwong; Chang Xu – Research Synthesis Methods, 2024
RobotReviewer is a tool for automatically assessing the risk of bias in randomized controlled trials, but there is limited evidence of its reliability. We evaluated the agreement between RobotReviewer and humans regarding the risk of bias assessment based on 1955 randomized controlled trials. The risk of bias in these trials was assessed via two…
Descriptors: Risk, Randomized Controlled Trials, Classification, Robotics
Qusai Khraisha; Sophie Put; Johanna Kappenberg; Azza Warraitch; Kristin Hadfield – Research Synthesis Methods, 2024
Systematic reviews are vital for guiding practice, research and policy, although they are often slow and labour-intensive. Large language models (LLMs) could speed up and automate systematic reviews, but their performance in such tasks has yet to be comprehensively evaluated against humans, and no study has tested Generative Pre-Trained…
Descriptors: Peer Evaluation, Research Reports, Artificial Intelligence, Computer Software
Rohlfing, Ingo – Field Methods, 2020
Empirical researchers using qualitative comparative analysis (QCA) can work with crisp, multivalue, and fuzzy sets. The relative advantages of crisp and multivalue sets have been discussed in the QCA literature. There has been little reflection on the more frequent decision between crisp and fuzzy sets for which there often is no theoretical…
Descriptors: Qualitative Research, Comparative Analysis, Reliability, Classification
Teck Kiang Tan – Practical Assessment, Research & Evaluation, 2024
The procedures of carrying out factorial invariance to validate a construct were well developed to ensure the reliability of the construct that can be used across groups for comparison and analysis, yet mainly restricted to the frequentist approach. This motivates an update to incorporate the growing Bayesian approach for carrying out the Bayesian…
Descriptors: Bayesian Statistics, Factor Analysis, Programming Languages, Reliability
Dadi Ramesh; Suresh Kumar Sanampudi – European Journal of Education, 2024
Automatic essay scoring (AES) is an essential educational application in natural language processing. This automated process will alleviate the burden by increasing the reliability and consistency of the assessment. With the advances in text embedding libraries and neural network models, AES systems achieved good results in terms of accuracy.…
Descriptors: Scoring, Essays, Writing Evaluation, Memory
Kinnear, George; Bennett, Max; Binnie, Rachel; Bolt, Róisín; Zheng, Yinglan – Teaching Mathematics and Its Applications, 2020
The MATH taxonomy classifies questions according to the mathematical skills required to answer them. It was created to aid the development of more balanced assessments in undergraduate mathematics and has since been used to compare different assessment regimes across school and university. To date, there has been no systematic investigation of the…
Descriptors: Taxonomy, Mathematics Instruction, Teaching Methods, Reliability
DeLuca, Stefanie – Sociological Methods & Research, 2023
Increasingly, the broader public, media and policymakers are looking to qualitative research to provide answers to our most pressing social questions. While an exciting and perhaps overdue moment for qualitative researchers, it is also a time when the method is coming under increasing scrutiny for a lack of reliability and transparency. The…
Descriptors: Qualitative Research, Reliability, Standards, Participant Observation