John R. Donoghue; Carol Eckerly – Applied Measurement in Education, 2024
Trend scoring constructed response items (i.e., rescoring Time A responses at Time B) gives rise to two-way data that follow a product multinomial distribution rather than the multinomial distribution that is usually assumed. Recent work has shown that the difference in sampling model can have profound negative effects on statistics usually used to…
Descriptors: Scoring, Error of Measurement, Reliability, Scoring Rubrics
Wiebe Koopal – Studies in Philosophy and Education, 2024
In this paper I try to 'rethink' consistency as an educational quality for the 3rd millennium, following Italo Calvino's choice to take it up in his lecture series Memos for the Next Millennium, and despite the fact that the (final) lecture devoted to this quality remained unwritten. After reflecting on how consistency already plays a certain role…
Descriptors: Reliability, Education, Instruction, Lecture Method
Constructing a Roadmap to Measure the Quality of Business Assessments Aimed at Curriculum Management
Silva, Thanuci; Santos, Regiane dos; Mallet, Débora – Journal of Education for Business, 2023
Assuring the quality of education is a concern of learning institutions. To do so, it is necessary to have assertive learning management, with consistent data on students' outcomes. This research provides associate deans and researchers with a roadmap for gathering evidence to improve the quality of open-ended assessments. Based on statistical…
Descriptors: Student Evaluation, Evaluation Methods, Business Education, Higher Education
Pearson, Terry – FORUM: for promoting 3-19 comprehensive education, 2023
Ofsted has frequently defended the judgements made during inspections by claiming that inspection ratings are reliable, as shown by the results from the collection of studies the inspectorate has conducted. I outline the inspectorate's view of reliability and problematise the studies that it has carried out, noting that these provide insufficient…
Descriptors: Inspection, Interrater Reliability, Decision Making, Value Judgment
Tavares, Walter; Kinnear, Benjamin; Schumacher, Daniel J.; Forte, Milena – Advances in Health Sciences Education, 2023
In this perspective, the authors critically examine "rater training" as it has been conceptualized and used in medical education. By "rater training," they mean the educational events intended to "improve" rater performance and contributions during assessment events. Historically, rater training programs have focused…
Descriptors: Medical Education, Interrater Reliability, Evaluation Methods, Training
Bennett L. Schwartz – Metacognition and Learning, 2024
Retrospective confidence refers to the phenomenological experience of the level of certainty that retrieved information is, in fact, correct. Retrospective confidence judgments are examined across a range of sub-disciplines in psychology from perception to memory research, and in education and legal applications. This paper focuses on…
Descriptors: Memory, Recall (Psychology), Cues, Learning Processes
Bonett, Douglas G. – Journal of Educational and Behavioral Statistics, 2022
The limitations of Cohen's κ are reviewed and an alternative G-index is recommended for assessing nominal-scale agreement. Maximum likelihood estimates, standard errors, and confidence intervals for a two-rater G-index are derived for one-group and two-group designs. A new G-index of agreement for multirater designs is proposed. Statistical…
Descriptors: Statistical Inference, Statistical Data, Interrater Reliability, Design
Atli Harðarson – Educational Philosophy and Theory, 2024
This paper has two aims. One is to draw a distinction between two types of trust. The other is to argue for its applicability in academic discourse on educational policies. One of the two types of trust is "ethical trust" that rests on beliefs about others' ethical virtues. The other is "institutional trust" that typically…
Descriptors: Trust (Psychology), Ethics, Reliability, Schools
Tsangaridou, Niki; Charalambous, Charalambos Y. – Quest, 2023
Focusing on systematic observation, one of the most potent methods of studying teaching quality, represents one of the numerous contributions of Daryl Siedentop to the profession. While he had a clear focus on issues of validity and reliability concerning systematic observation, over the past decades, attention to such issues appears to have…
Descriptors: Physical Education Teachers, Observation, Validity, Reliability
Courtney Bell; Jessalynn James; Eric S. Taylor; James Wyckoff – Journal of Policy Analysis and Management, 2025
We study the returns to experience in teaching, estimated using supervisor ratings from classroom observations. We describe the assumptions required to interpret changes in observation ratings over time as the causal effect of experience on performance. We compare two difference-in-differences strategies: the two-way fixed effects estimator common…
Descriptors: Lesson Observation Criteria, Teaching Experience, Teacher Evaluation, Supervisors
Robert H. Eaglen; Steven J. Durning; Holly S. Meyer; Christopher S. Candler – Quality in Higher Education, 2024
Higher education accreditation has spread internationally as a vehicle for quality assurance and improvement but is strongly influenced by accreditation practices in the United States. The organisational structure and processes of seven United States health professions accreditors were analysed to identify common characteristics that reflect…
Descriptors: Accreditation (Institutions), Quality Assurance, Evaluators, Evaluation Methods
Gitomer, Drew H.; Martínez, José Felipe; Battey, Dan; Hyland, Nora E. – American Educational Research Journal, 2021
The Educative Teacher Performance Assessment (edTPA) is a system of standardized portfolio assessments of teaching performance mandated for use by educator preparation programs in 18 states, and approved in 21 others, as part of initial certification for preservice teachers. Because of the high stakes involved for examinees, it is critical that…
Descriptors: Evaluation, Performance Based Assessment, Test Reliability, Test Validity
Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2021
The population discrepancy between unstandardized and standardized reliability of homogeneous multicomponent measuring instruments is examined. Within a latent variable modeling framework, it is shown that the standardized reliability coefficient for unidimensional scales can be markedly higher than the corresponding unstandardized reliability…
Descriptors: Test Reliability, Computation, Measures (Individuals), Research Problems
Junjie, Ma; Yingxin, Ma – Online Submission, 2022
This paper aims to explore the philosophical theoretical foundations of two basic research paradigms, namely positivism and interpretivism. In the discussion process, literature in the relevant fields including academic papers and books is reviewed and used as support for the analysis. Firstly, the paper explores the differences between the…
Descriptors: Ideology, Bias, Credibility, Research Methodology
Kroc, Edward; Olvera Astivia, Oscar L. – Educational and Psychological Measurement, 2022
Setting cutoff scores is one of the most common practices when using scales to aid in classification purposes. This process is usually done univariately where each optimal cutoff value is decided sequentially, subscale by subscale. While it is widely known that this process necessarily reduces the probability of "passing" such a test,…
Descriptors: Multivariate Analysis, Cutting Scores, Classification, Measurement