ERIC - Search Results

Showing 1 to 15 of 39 results

Linear and Nonlinear Indices of Score Accuracy and Item Effectiveness for Measures That Contain Locally Dependent Items

Peer reviewed

Ferrando, Pere J.; Navarro-González, David; Morales-Vives, Fabia – Educational and Psychological Measurement, 2025

The problem of local item dependencies (LIDs) is very common in personality and attitude measures, particularly in those that measure narrow-bandwidth dimensions. At the structural level, these dependencies can be modeled by using extended factor analytic (FA) solutions that include correlated residuals. However, the effects that LIDs have on the…

Descriptors: Scores, Accuracy, Evaluation Methods, Factor Analysis

Scale Reliability Evaluation with Heterogeneous Populations

Peer reviewed

Raykov, Tenko; Marcoulides, George A. – Educational and Psychological Measurement, 2015

A latent variable modeling approach for scale reliability evaluation in heterogeneous populations is discussed. The method can be used for point and interval estimation of reliability of multicomponent measuring instruments in populations representing mixtures of an unknown number of latent classes or subpopulations. The procedure is helpful also…

Descriptors: Test Reliability, Evaluation Methods, Measurement Techniques, Computation

Student Satisfaction with Online Learning: Is It a Psychological Contract?

Peer reviewed

Dziuban, Charles; Moskal, Patsy; Thompson, Jessica; Kramer, Lauren; DeCantis, Genevieve; Hermsdorfer, Andrea – Online Learning, 2015

The authors explore the possible relationship between student satisfaction with online learning and the theory of psychological contracts. The study incorporates latent trait models using the image analysis procedure and computation of Anderson and Rubin factors scores with contrasts for students who are satisfied, ambivalent, or dissatisfied with…

Descriptors: Student Attitudes, Online Courses, Scores, Learning Experience

Making Sense of Learning Gain in Higher Education

Peer reviewed

Evans, C.; Kandiko Howson, C.; Forsythe, A. – Higher Education Pedagogies, 2018

Internationally, the political appetite for educational measurement capable of capturing a metric of value for money and effectiveness has momentum. While most would agree with the need to assess costs relevant to quality to help support better governmental policy decisions about public spending, poorly understood measurement comes with unintended…

Descriptors: Higher Education, Achievement Gains, Political Issues, Quality Assurance

Reliability Generalization: "Lapsus Linguae"

Smith, Julie M. – ProQuest LLC, 2011

This study examines the proposed Reliability Generalization (RG) method for studying reliability. RG employs the application of meta-analytic techniques similar to those used in validity generalization studies to examine reliability coefficients. This study explains why RG does not provide a proper research method for the study of reliability,…

Descriptors: Reliability, Generalization, Sampling, Research Methodology

An Examination of the Relationship of Sample Size and Mean Length of Utterance for Children with Developmental Language Impairment

Peer reviewed

Casby, Michael W. – Child Language Teaching and Therapy, 2011

Mean length of utterance (MLU) is a frequently used measure of the expressive language of young children. The suggested conventional, contemporary, clinical practice is to calculate it from a language sample of a minimum of 50 to 100 contiguous intelligible utterances. This practice places considerable strain on professionals working with young…

Descriptors: Language Impairments, Young Children, Expressive Language, Developmental Delays

Impact of Design Effects in Large-Scale District and State Assessments

Peer reviewed

Phillips, Gary W. – Applied Measurement in Education, 2015

This article proposes that sampling design effects have potentially huge unrecognized impacts on the results reported by large-scale district and state assessments in the United States. When design effects are unrecognized and unaccounted for they lead to underestimating the sampling error in item and test statistics. Underestimating the sampling…

Descriptors: State Programs, Sampling, Research Design, Error of Measurement

Making Do with What We Have: Use Your Bootstraps

Peer reviewed

Calmettes, Guillaume; Drummond, Gordon B.; Vowler, Sarah L. – Advances in Physiology Education, 2012

A jack knife is a pocket knife that is put to many tasks, because it's ready to hand. Often there could be a better tool for the job, such as a screwdriver, a scraper, or a can-opener, but these are not usually pocket items. In statistical terms, the expression implies making do with what's available. Another simile, of an extreme situation, is…

Descriptors: Statistical Analysis, Computation, Population Distribution, Evaluation Methods

Leveraging Data Sampling and Practical Knowledge: Field Instructors' Perceptions about Inter-Rater Reliability Data

Peer reviewed

Soslau, Elizabeth; Lewis, Kandia – Action in Teacher Education, 2014

For accreditation and programmatic decision making, education school administrators use inter-rater reliability analyses to judge credibility of student-teacher assessments. Although weak levels of agreement between university-appointed supervisors and cooperating teachers are usually interpreted to indicate that the process is not being…

Descriptors: Interrater Reliability, Accreditation (Institutions), Student Teacher Evaluation, Focus Groups

Asking Students about Teaching: Student Perception Surveys and Their Implementation. MET Project Policy and Practice Brief

Bill & Melinda Gates Foundation, 2012

No one has a bigger stake in teaching effectiveness than students. Nor are there any better experts on how teaching is experienced by its intended beneficiaries. Only recently have many policymakers and practitioners come to recognize that--when asked the right questions, in the right ways--students can be an important source of information on the…

Descriptors: Student Surveys, Student Attitudes, Feedback (Response), Test Validity

Using a Sampling Strategy to Address Psychometric Challenges in Tutorial-Based Assessments

Peer reviewed

Eva, Kevin W.; Solomon, Patty; Neville, Alan J.; Ladouceur, Michael; Kaufman, Karyn; Walsh, Allyn; Norman, Geoffrey R. – Advances in Health Sciences Education, 2007

Introduction: Tutorial-based assessment, despite providing a good match with the philosophy adopted by educational programmes that emphasize small group learning, remains one of the greatest challenges for educators working in this context. The current study was performed in an attempt to assess the psychometric characteristics of tutorial-based…

Descriptors: Construct Validity, Sampling, Psychometrics, Evaluation Methods

Introduction to the Development of the ISPCAN Child Abuse Screening Tools

Peer reviewed

Runyan, Desmond K.; Dunne, Michael P.; Zolotor, Adam J. – Child Abuse & Neglect: The International Journal, 2009

The "World Report on Children and Violence" (Pinheiro, 2006) was produced at the request of the UN Secretary General and the UN General Assembly. This report recommended improvement in research on child abuse. ISPCAN representatives took this charge and developed 3 new instruments. We describe this background and introduce three new measures…

Descriptors: Child Abuse, Screening Tests, Child Welfare, Test Construction

Estimation of Generalizability Coefficients via a Structural Equation Modeling Approach to Scale Reliability Evaluation

Peer reviewed

Raykov, Tenko; Marcoulides, George A. – International Journal of Testing, 2006

A structural equation modeling approach to scale reliability evaluation can be employed to estimate generalizability theory indexes in settings where sampling of subjects and conditions is carried out. In one- and two-facet crossed designs, it is demonstrated how this method can be used to obtain estimates of relative generalizability…

Descriptors: Computation, Generalizability Theory, Structural Equation Models, Reliability

Acoustic and Perceptual Analysis of Speech Adaptation to an Artificial Palate

Peer reviewed

McAuliffe, Megan J.; Robb, Michael P.; Murdoch, Bruce E. – Clinical Linguistics & Phonetics, 2007

The study investigated adaptation to a standard electropalatographic (EPG) practise palate in a group of eight adults (mean age = 24 years). The participants read the phrase "a CVC" over four sampling conditions: prior to inserting the palate, immediately following insertion of the palate, 45 minutes after palate insertion, and 3 hours after…

Descriptors: Articulation (Speech), Phonology, Sampling, Acoustics

Sample Size Determinations for the Two Rater Kappa Statistic.

Peer reviewed

Flack, Virginia F.; And Others – Psychometrika, 1988

A method is presented for determining sample size that will achieve a pre-specified bound on confidence interval width for the interrater agreement measure "kappa." The same results can be used when a pre-specified power is desired for testing hypotheses about the value of kappa. (Author/SLD)

Descriptors: Evaluation Methods, Interrater Reliability, Research Methodology, Research Problems
