Showing 1 to 15 of 61 results
Peer reviewed
Wesolowski, Brian C.; Wind, Stefanie A. – Journal of Educational Measurement, 2019
Rater-mediated assessments are a common methodology for measuring persons, investigating rater behavior, and/or defining latent constructs. The purpose of this article is to provide a pedagogical framework for examining rater variability in the context of rater-mediated assessments using three distinct models. The first model is the observation…
Descriptors: Interrater Reliability, Models, Observation, Measurement
Pennell, Adam – ProQuest LLC, 2019
This dissertation consists of three studies which examined multidimensional balance in youth (≤ 21 years; Individuals with Disabilities Education Act, 2004) with visual impairments (VIs) using the Brief-Balance Evaluation Systems Test (Brief-BESTest). These studies have the potential to inform (adapted) physical education curricula and…
Descriptors: Psychomotor Skills, Youth, Visual Impairments, Human Posture
Peer reviewed
Lee, Keng-Lin; Tsai, Shih-Li; Chiu, Yu-Ting; Ho, Ming-Jung – Advances in Health Sciences Education, 2016
Measurement invariance is a prerequisite for comparing measurement scores from different groups. In medical education, multi-source feedback (MSF) is utilized to assess core competencies, including professionalism. However, little attention has been paid to the measurement invariance of assessment instruments; that is, whether an instrument…
Descriptors: Measurement, Scores, Medical Education, Competence
Peer reviewed
Full text PDF available on ERIC
Wedman, Jonathan; Lyrén, Per-Erik – Practical Assessment, Research & Evaluation, 2015
When subscores on a test are reported to the test taker, the appropriateness of reporting them depends on whether they provide useful information above what is provided by the total score. Subscores that fail to do so lack adequate psychometric quality and should not be reported. There are several methods for examining the quality of subscores,…
Descriptors: Evaluation Methods, Psychometrics, Scores, Tests
Peer reviewed
Hermans, Heidi; van der Pas, Femke H.; Evenhuis, Heleen M. – Research in Developmental Disabilities: A Multidisciplinary Journal, 2011
Background: In recent decades, several instruments measuring anxiety in adults with intellectual disabilities have been developed. Aim: To give an overview of the characteristics and psychometric properties of self-report and informant-report instruments measuring anxiety in this group. Method: Systematic review of the literature. Results:…
Descriptors: Mental Retardation, Learning Disabilities, Interrater Reliability, Measures (Individuals)
Feinberg, Richard A. – ProQuest LLC, 2012
Subscores, also known as domain scores, diagnostic scores, or trait scores, can help determine test-takers' relative strengths and weaknesses and appropriately focus remediation. However, subscores often have poor psychometric properties, particularly reliability and distinctiveness (Folske, Gessaroli, & Swanson, 1999; Monaghan, 2006;…
Descriptors: Simulation, Tests, Testing, Scores
Smith, Julie M. – ProQuest LLC, 2011
This study examines the proposed Reliability Generalization (RG) method for studying reliability. RG applies meta-analytic techniques similar to those used in validity generalization studies to examine reliability coefficients. This study explains why RG does not provide a proper research method for the study of reliability,…
Descriptors: Reliability, Generalization, Sampling, Research Methodology
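For context on what applying meta-analytic techniques to reliability coefficients can involve, the sketch below aggregates a set of published coefficients into a sample-size-weighted mean and a weighted variance. The coefficients, sample sizes, and the simple weighting scheme are illustrative assumptions only; they are not data from, nor the method evaluated in, the dissertation above.

    import numpy as np

    # Hypothetical reliability coefficients (e.g., coefficient alpha) and the
    # sample sizes of the studies that reported them.
    coefficients = np.array([0.78, 0.84, 0.71, 0.88, 0.80])
    sample_sizes = np.array([120, 250, 95, 310, 180])

    # Sample-size-weighted mean: one simple way a reliability-generalization
    # style synthesis summarizes a distribution of coefficients across studies.
    weights = sample_sizes / sample_sizes.sum()
    mean_rel = float(np.sum(weights * coefficients))

    # Weighted variability around that mean, examined to see how much
    # reliability fluctuates across samples and study characteristics.
    var_rel = float(np.sum(weights * (coefficients - mean_rel) ** 2))

    print(f"weighted mean reliability = {mean_rel:.3f}, weighted variance = {var_rel:.4f}")

Fuller RG analyses typically transform the coefficients before pooling and model study characteristics as moderators; the flat weighting here is only meant to make the idea of "meta-analysis of reliability coefficients" concrete.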
Peer reviewed
Pan, Tianshu; Yin, Yue – Psychological Methods, 2012
In the discussion of mean square difference (MSD) and standard error of measurement (SEM), Barchard (2012) concluded that the MSD between 2 sets of test scores is greater than 2(SEM)² and SEM underestimates the score difference between 2 tests when the 2 tests are not parallel. This conclusion has limitations for 2 reasons. First,…
Descriptors: Error of Measurement, Geometric Concepts, Tests, Structural Equation Models
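As background for the quantities compared in this exchange, a minimal classical test theory sketch (a generic illustration, not a reproduction of either paper's argument) runs as follows: if two strictly parallel forms share a true score, with X_1 = T + E_1 and X_2 = T + E_2 and uncorrelated errors of equal variance SEM², then

    \mathrm{MSD} = E\big[(X_1 - X_2)^2\big] = E\big[(E_1 - E_2)^2\big] = \sigma^2_{E_1} + \sigma^2_{E_2} = 2\,\mathrm{SEM}^2

so the expected MSD equals 2·SEM² only under these parallel-forms assumptions; how the comparison behaves when the two tests are not parallel is precisely what is at issue in the abstract above.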
Peer reviewed
Kim, Sunae; Kalish, Charles W.; Harris, Paul L. – Cognitive Development, 2012
Prior work shows that children can make inductive inferences about objects based on their labels rather than their appearance (Gelman, 2003). A separate line of research shows that children's trust in a speaker's label is selective. Children accept labels from a reliable speaker over an unreliable speaker (e.g., Koenig & Harris, 2005). In the…
Descriptors: Logical Thinking, Inferences, Classification, Young Children
Peer reviewed
Reynolds, Jesse S.; Treu, Judith A.; Njike, Valentine; Walker, Jennifer; Smith, Erica; Katz, Catherine S.; Katz, David L. – Journal of Nutrition Education and Behavior, 2012
Objective: To determine the reliability and validity of a 10-item questionnaire, the Food Label Literacy for Applied Nutrition Knowledge questionnaire. Methods: Participants were elementary school children exposed to a 90-minute school-based nutrition program. Reliability was assessed via Cronbach alpha and intraclass correlation coefficient…
Descriptors: Elementary School Students, Age, Nutrition, Validity
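The abstract names Cronbach alpha as its internal-consistency statistic. As a minimal, generic illustration of that computation (the item-response matrix and variable names below are hypothetical and not taken from the study), a sketch in Python might look like:

    import numpy as np

    def cronbach_alpha(items: np.ndarray) -> float:
        """Cronbach's alpha for an (n_respondents, n_items) score matrix."""
        items = np.asarray(items, dtype=float)
        k = items.shape[1]                         # number of items
        item_vars = items.var(axis=0, ddof=1)      # variance of each item
        total_var = items.sum(axis=1).var(ddof=1)  # variance of summed scores
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # Hypothetical responses: 5 children answering 4 dichotomously scored items
    responses = np.array([
        [1, 1, 0, 1],
        [1, 0, 0, 1],
        [1, 1, 1, 1],
        [0, 0, 0, 1],
        [1, 1, 1, 0],
    ])
    print(round(cronbach_alpha(responses), 3))

The intraclass correlation coefficient mentioned alongside alpha is estimated differently, from variance components across repeated administrations, and is not shown here.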
Peer reviewed
Shin, Yongyun; Raudenbush, Stephen W. – Psychometrika, 2012
Social scientists are frequently interested in assessing the qualities of social settings such as classrooms, schools, neighborhoods, or day care centers. The most common procedure requires observers to rate social interactions within these settings on multiple items and then to combine the item responses to obtain a summary measure of setting…
Descriptors: Generalizability Theory, Neighborhoods, Intervals, Child Care Centers
Peer reviewed
Sandell, Rolf; Kimber, Birgitta; Andersson, Marie; Elg, Mattias; Fharm, Linus; Gustafsson, Niklas; Soderbaum, Wendela – Educational Psychology in Practice, 2012
This is a psychometric analysis of an instrument to assess the socio-emotional development of school students, How I Feel (HIF), developed as a situational judgment test, with scoring based on expert judgments. The HIF test was administered in grades 4-9, 1999-2005. Internal consistency, retest reliability, and year-to-year stability were…
Descriptors: Evaluation Methods, Emotional Development, Psychometrics, Construct Validity
Peer reviewed
Suto, Irenka; Nadas, Rita; Bell, John – Research Papers in Education, 2011
Accurate marking is crucial to the reliability and validity of public examinations, in England and internationally. Factors contributing to accuracy have been conceptualised as affecting either marking task demands or markers' personal expertise. The aim of this empirical study was to develop this conceptualisation through investigating the…
Descriptors: Academic Achievement, Examiners, Biology, Foreign Countries
Peer reviewed
Touchie, Claire; Humphrey-Murto, Susan; Ainslie, Martha; Myers, Kathryn; Wood, Timothy J. – Advances in Health Sciences Education, 2010
Oral examinations have become more standardized over recent years. Traditionally a small number of raters were used for this type of examination. Past studies suggested that more raters should improve reliability. We compared the results of a multi-station structured oral examination using two different rater models, those based in a station,…
Descriptors: Interrater Reliability, Internal Medicine, Evaluation Methods, Tests
Peer reviewed
Dory, Valerie; Gagnon, Robert; Charlin, Bernard – Advances in Health Sciences Education, 2010
Case-specificity, i.e., variability of a subject's performance across cases, has been a consistent finding in medical education. It has important implications for assessment validity and reliability. Its root causes remain a matter of discussion. One hypothesis, content-specificity, links variability of performance to variable levels of relevant…
Descriptors: Medical Education, Trainees, English (Second Language), Error of Measurement