Sung, Yao-Ting; Chang, Kuo-En; Chang, Tzyy-Hua; Yu, Wen-Cheng – Journal of Adolescence, 2010
Self- and peer assessments are becoming more popular in classrooms, but there are few data on the reliability and validity of such assessments performed by school children. Because these factors are greatly affected by the number of raters, we conducted two studies to determine the rating behaviours of teenagers in self- and peer assessments, and…
Descriptors: Generalizability Theory, Peer Evaluation, Validity, Reliability
Brennan, Robert L. – Educational and Psychological Measurement, 2007
This article provides general procedures for obtaining unbiased estimates of variance components for any random-model balanced design under any bootstrap sampling plan, with the focus on designs of the type typically used in generalizability theory. The results reported here are particularly helpful when the bootstrap is used to estimate standard…
Descriptors: Generalizability Theory, Error of Measurement, Statistical Analysis
Leung, Kai-Kuen; Wang, Wei-Dan; Chen, Yen-Yuan – Advances in Health Sciences Education, 2012
There is a lack of information on the use of multi-source evaluation to assess trainees' interpersonal and communication skills in Oriental settings. This study is conducted to assess the reliability and applicability of assessing the interpersonal and communication skills of family medicine residents by patients, peer residents, nurses, and…
Descriptors: Foreign Countries, Clinical Teaching (Health Professions), Communication Skills, Patients
Yin, Ping; Sconing, James – Educational and Psychological Measurement, 2008
Standard-setting methods are widely used to determine cut scores on a test that examinees must meet for a certain performance standard. Because standard setting is a measurement procedure, it is important to evaluate variability of cut scores resulting from the standard-setting process. Generalizability theory is used in this study to estimate…
Descriptors: Generalizability Theory, Standard Setting, Cutting Scores, Test Items
Burch, V. C.; Norman, G. R.; Schmidt, H. G.; van der Vleuten, C. P. M. – Advances in Health Sciences Education, 2008
High stakes postgraduate specialist certification examinations have considerable implications for the future careers of examinees. Medical colleges and professional boards have a social and professional responsibility to ensure their fitness for purpose. To date there is a paucity of published data about the reliability of specialist certification…
Descriptors: Generalizability Theory, Physicians, Foreign Countries, Specialists
Gebril, Atta – Language Testing, 2009
Generalizability of writing scores has always been a longstanding concern in L2 writing assessment. A number of studies have been conducted to investigate this topic during the last two decades. However, with the introduction of new test methods, such as reading-to-write tasks, generalizability studies need to focus on the score accuracy of…
Descriptors: Generalizability Theory, Writing Evaluation, Writing Tests, Scores
Clauser, Brian E.; Harik, Polina; Margolis, Melissa J.; McManus, I. C.; Mollon, Jennifer; Chis, Liliana; Williams, Simon – Applied Measurement in Education, 2009
Numerous studies have compared the Angoff standard-setting procedure to other standard-setting methods, but relatively few studies have evaluated the procedure based on internal criteria. This study uses a generalizability theory framework to evaluate the stability of the estimated cut score. To provide a measure of internal consistency, this…
Descriptors: Generalizability Theory, Group Discussion, Standard Setting (Scoring), Scoring
Hilton, N. Zoe; Harris, Grant T. – Journal of Interpersonal Violence, 2009
Prediction effect sizes such as ROC area are important for demonstrating a risk assessment's generalizability and utility. How a study defines recidivism might affect predictive accuracy. Nonrecidivism is problematic when predicting specialized violence (e.g., domestic violence). The present study cross-validates the ability of the Ontario…
Descriptors: Recidivism, Family Violence, Prediction, Effect Size
Lei, Pui-Wa; Smith, Maria; Suen, Hoi K. – Psychology in the Schools, 2007
Direct observation of behaviors is a data collection method customarily used in clinical and educational settings. Repeated measures and small samples are inherent characteristics of observational studies that pose challenges to the numerical estimation of reliability for observational data. In this article, we review some debates about the use of…
Descriptors: Generalizability Theory, Data Collection, Observation, Evaluation Methods
Cheng, Albert – Online Submission, 2012
The Common Core State Standards Initiative is the latest development in a long history of standards-based reform in the United States. As of November 4, 2011, 46 states and the District of Columbia have adopted new curricular standards, called the Common Core State Standards (CCSS). These states and the District of Columbia are now implementing…
Descriptors: State Standards, Academic Achievement, Educational Change, Teacher Attitudes
Lakes, Kimberley D.; Hoyt, William T. – Infant and Child Development, 2008
Cronbach and Meehl ("Psychol. Bull." 1955; 52:281-302) stated that the key question to be addressed when assessing construct validity is "What sources contribute to variance in test performance?" We illustrate the utility of generalizability theory (GT) as a conceptual framework that encourages psychological researchers to…
Descriptors: Generalizability Theory, Construct Validity, Observation, Behavior Rating Scales
Hagtvet, Knut A.; Hoglend, Per A. – Measurement and Evaluation in Counseling and Development, 2008
Precision and generalizability for relative and absolute change scores were estimated by means of error/tolerance ratios and generalizability coefficients for 51 patients receiving 1 year of psychodynamic psychotherapy. These estimations involved 6 scale indicators and 3 raters. Practical suggestions are offered for the number of raters needed to…
Descriptors: Generalizability Theory, Psychotherapy, Change, Scores
Gebril, Atta – Assessing Writing, 2010
Integrated tasks are currently employed in a number of L2 exams since they are perceived as an addition to the writing-only task type. Given this trend, the current study investigates composite score generalizability of both reading-to-write and writing-only tasks. For this purpose, a multivariate generalizability analysis is used to investigate…
Descriptors: Scoring, Scores, Second Language Instruction, Writing Evaluation
Collett, Jessica L.; Childs, Ellen – Current Research in Social Psychology, 2009
Social psychologists in both sociology and psychology commonly use vignettes to gauge how people might respond in a given situation. Research subjects in such studies, like those in other experiments, are often undergraduates, surveyed or recruited in classes. While there has been significant attention to the generalizability of students'…
Descriptors: Social Psychology, Psychologists, Sociology, Psychology
Chafouleas, Sandra M.; Christ, Theodore J.; Riley-Tillman, T. Chris – Educational and Psychological Measurement, 2009
Generalizability theory is used to examine the impact of scaling gradients on a single-item Direct Behavior Rating (DBR). A DBR refers to a type of rating scale used to efficiently record target behavior(s) following an observation occasion. Variance components associated with scale gradients are estimated using a random effects design for persons…
Descriptors: Generalizability Theory, Undergraduate Students, Scaling, Rating Scales