Showing 526 to 540 of 728 results
Peer reviewed
Abedi, Jamal – Multivariate Behavioral Research, 1996
The Interrater/Test Reliability System (ITRS) is described. The ITRS is a comprehensive computer tool used to address questions of interrater reliability that computes several different indices of interrater reliability and the generalizability coefficient over raters and topics. The system is available in IBM compatible or Macintosh format. (SLD)
Descriptors: Computer Software, Computer Software Evaluation, Evaluation Methods, Evaluators
Peer reviewed
Kane, Michael – Applied Measurement in Education, 1996
This overview of the role of error and tolerance for error in measurement asserts that the generic precision associated with a measurement procedure is defined as the root mean square error, or standard error, in some relevant population. This view of precision is explored in several applications of measurement. (SLD)
Descriptors: Error of Measurement, Error Patterns, Generalizability Theory, Measurement Techniques
Peer reviewed
Norcini, John J.; And Others – Evaluation and the Health Professions, 1990
Aggregate scoring was applied to a recertifying examination for medical professionals to generate an answer key and allow comparison of peer examinees. Results for 1,927 candidates for recertification indicate considerable agreement between the traditional answer key and the aggregate answer key. (TJH)
Descriptors: Answer Keys, Criterion Referenced Tests, Error of Measurement, Generalizability Theory
Peer reviewed
Marcoulides, George A. – Educational and Psychological Measurement, 1994
Effects of different weighting schemes on selecting the optimal number of observations in multivariate-multifacet generalizability designs are studied when cost constraints are imposed. Comparison of four schemes through simulation indicates that all four produce similar optimal values and that reliability should be similar. (SLD)
Descriptors: Budgeting, Comparative Analysis, Costs, Factor Analysis
Peer reviewed
Goldstein, Zvi; Marcoulides, George A. – Educational and Psychological Measurement, 1991
An efficient search procedure is presented for determining the optimal number of observations of facets in a design that maximize generalizability when resource constraints are imposed. The procedure is illustrated for three-facet and four-facet designs, with extensions for other configurations. (Author/SLD)
Descriptors: Cost Effectiveness, Decision Making, Equations (Mathematics), Generalizability Theory
Peer reviewed
Smith, Philip L.; Luecht, Richard M. – Applied Psychological Measurement, 1992
The implications of serially correlated effects on the results of generalizability analyses are discussed. Simulated data are provided that demonstrate the biases that serially correlated effects introduce into the results. Serial correlation in measurement effects can have a marked influence on the impression of the dependability of measurement…
Descriptors: Computer Simulation, Correlation, Equations (Mathematics), Estimation (Mathematics)
Peer reviewed
Goodwin, Laura D.; And Others – Journal of Special Education, 1991
Using data from an individually administered interview schedule (the Consumer Satisfaction Inventory), reliability among nine interviewers was estimated with several statistical methods, including simple percentages of agreement, kappa and weighted kappa, Pearson correlations, t tests on interviewers' means, and generalizability theory techniques.…
Descriptors: Disabilities, Educational Research, Elementary Secondary Education, Estimation (Mathematics)
Peer reviewed
Swartz, Carl W.; Hooper, Stephen R.; Montgomery, James W.; Wakely, Melissa B.; De Kruif, Renee E. L.; Reed, Martha; Brown, Timothy T.; Levine, Melvin D.; White, Kinnard P. – Educational and Psychological Measurement, 1999
Used generalizability theory to investigate the impact of the number of raters and the type of decision (relative versus absolute) on the reliability of writing scores. Results from 251 middle school students and 20 intermediate grade students show that reliability coefficients decline as the number of raters declines and when absolute decisions…
Descriptors: Estimation (Mathematics), Generalizability Theory, Holistic Evaluation, Intermediate Grades
Peer reviewed
Hoyt, William T.; Melby, Janet N. – Counseling Psychologist, 1999
Addresses generalizability theory (GT), which offers a flexible framework for assessing dependability of measurement. GT allows for consideration of multiple sources of error, allowing investigators to assess the overall impact of measurement error. Illustrative analyses demonstrate the special advantages of GT for planning studies in which…
Descriptors: Counseling Psychology, Generalizability Theory, Measurement, Research Design
Peer reviewed
Norcini, John; Grosso, Lou – Applied Measurement in Education, 1998
Ratings of test item relevance were collected from 57 practitioners from a pretest of a medical certifying examination. Ratings were correlated with item difficulty, but the relationship between ratings and item discrimination was less clear. Application of generalizability theory shows that reasonable estimates of item, stem, and total test…
Descriptors: Certification, Difficulty Level, Estimation (Mathematics), Generalizability Theory
Peer reviewed
Heck, Ronald H.; Johnsrud, Linda K.; Rosser, Vicki J. – Research in Higher Education, 2000
Little research exists on the assessment of administrators' performance in higher education. The authors offer an evaluation model for assessing and monitoring the effectiveness of academic deans and directors, using generalizability theory as a basis for developing more accurate assessment procedures. (JM)
Descriptors: Academic Deans, Administrator Effectiveness, Administrator Evaluation, College Administration
Peer reviewed
Yin, Ping – Educational and Psychological Measurement, 2005
The main purpose of this study is to examine the content structure of the Multistate Bar Examination (MBE) using the "table of specifications" model from the perspective of multivariate generalizability theory. Specifically, using MBE data collected over different years (six administrations: three from the February test and three from the July test),…
Descriptors: Correlation, Generalizability Theory, Statistical Analysis, Multivariate Analysis
Peer reviewed
Johnson, Robert L.; Penny, James; Gordon, Belita; Shumate, Steven R.; Fisher, Steven P. – Language Assessment Quarterly, 2005
Many studies have indicated that at least 2 raters should score writing assessments to improve interrater reliability. However, even for assessments that characteristically demonstrate high levels of rater agreement, 2 raters of the same essay can occasionally report different, or discrepant, scores. If a single score, typically referred to as an…
Descriptors: Interrater Reliability, Scores, Evaluation, Reliability
Peer reviewed
Stern, Hal S. – Psychological Methods, 2005
I. Klugkist, O. Laudy, and H. Hoijtink (2005) presented a Bayesian approach to analysis of variance models with inequality constraints. Constraints may play 2 distinct roles in data analysis. They may represent prior information that allows more precise inferences regarding parameter values, or they may describe a theory to be judged against the…
Descriptors: Probability, Inferences, Bayesian Statistics, Data Analysis
Peer reviewed
Cheong, Yuk Fai – International Journal of Testing, 2006
This article considers and illustrates a strategy to study effects of school context on differential item functioning (DIF) in large-scale assessment. The approach employs a hierarchical generalized linear modeling framework to (a) detect DIF, and (b) identify school-level correlates of the between-group differences in item performance. To…
Descriptors: Context Effect, Test Bias, Causal Models, Educational Assessment