Corradi, David – Assessment & Evaluation in Higher Education, 2023
Juries are a high-stake practice in higher education to assess complex competencies. However common, research remains behind in detailing the psychometric qualities of juries, especially when using rubrics or rating scales as an assessment tool. In this study, I analyze a case of a jury assessment (N = 191) of product development where both…
Descriptors: Court Litigation, Educational Practices, Higher Education, Rating Scales
Ray, Brian; Babb, Jacob; Wooten, Courtney Adams – Composition Studies, 2018
Student evaluations of teaching (SETs) are frequently used to assess college teachers. However, education research has shown that there is potential for bias in SETs, especially based on instructor variables. Aside from Amy Dayton's 2015 work on assessment that advises using SETs only in concert with other measures, English studies scholars have…
Descriptors: Student Evaluation of Teacher Performance, Teacher Evaluation, Educational History, Test Bias
Liu, Ou Lydia; Mao, Liyang; Zhao, Tingting; Yang, Yi; Xu, Jun; Wang, Zhen – ETS Research Report Series, 2016
Chinese higher education is experiencing rapid development and growth. With tremendous resources invested in higher education, policy makers have requested more direct evidence of student learning. However, assessment tools that can be used to measure college-level learning are scarce in China. To mitigate this situation, we translated the…
Descriptors: Foreign Countries, Higher Education, Critical Thinking, College Students
Wu, Pei-Chen; Huang, Tsai-Wei – Measurement and Evaluation in Counseling and Development, 2010
This study applied the mixed Rasch model to investigate person heterogeneity in the Beck Depression Inventory-II-Chinese version (BDI-II-C) and its effects on dimensionality and construct validity. Person heterogeneity was reflected by two latent classes that differ qualitatively. Additionally, person heterogeneity adversely affected the…
Descriptors: Construct Validity, Validity, Depression (Psychology), Item Response Theory
Brennan, David J. – Higher Education Research and Development, 2008
This paper provides an overview of the issue of student anonymity in the summative assessment of student work in higher education. It considers both theoretical literature pertaining to bias in the evaluation of the work of others and the limited empirical work undertaken on this issue in higher education. It then describes the experience of three…
Descriptors: Higher Education, Student Evaluation, Interrater Reliability, Test Bias

Newstead, Stephen E.; Dennis, Ian – Assessment and Evaluation in Higher Education, 1990
Three studies investigating the existence of sex bias in the grading of undergraduate students, by examining interrater reliability for blind and non-blind grading, are reported. Negative evidence found in the results and the confusing picture presented by previous research indicate little firm evidence of sex bias in grading. (Author/MSE)
Descriptors: Evaluation Methods, Grading, Higher Education, Interrater Reliability

Napier, John D. – Journal of Psychology, 1979
Supports claims that the "Defining Issues Test" of cognitive-moral development cannot be faked higher. Finds that instruction about cognitive-moral development affected the scores of the teacher trainees who were tested. (RL)
Descriptors: Cognitive Development, Higher Education, Moral Development, Test Bias

Saunders, Phillip; Welsh, Arthur L. – Journal of Economic Education, 1975
Compared the "hybrid" Test of Understanding in College Economics (TUCE) with the four original TUCE versions and found that the hybrid version is 1) "broader," "thinner," and less technical in terms of content coverage; 2) more reliable with a generally superior item analysis structure; and 3) slightly "easier" for students who have taken economics…
Descriptors: Economics, Economics Education, Educational Testing, Evaluation

Bardo, John W.; Yeager, Samuel J. – Perceptual and Motor Skills, 1982
Responses to various fixed test-response formats were examined for "reliability" due to systematic error; Cronbach's alphas up to .67 were obtained. Of formats tested, four-point Likert Scales were least affected while forms of lines and faces were most problematic. Possible modification in alpha to account for systematic bias is…
Descriptors: Higher Education, Measures (Individuals), Psychometrics, Response Style (Tests)

Nelson, Jack K.; Dorociak, Jeff J. – Journal of Physical Education, Recreation & Dance, 1982
Test measurement, reliability, and validity are discussed in relation to methods of physical fitness testing. A successful testing method which involved students testing their peers is described, showing the administration of various test items and the use of test practice procedures. (JN)
Descriptors: Higher Education, Physical Education, Physical Fitness, Student Participation

Branthwaite, Alan; And Others – Educational Review, 1981
In this naturalistic study of essay marking, 15 university lecturers graded an examination paper and completed the Eysenck Personality Questionnaire. A significant positive correlation was found between the marks given and the grader's lie score, indicating possible effects of staff-student interactions or social desirability on biases in grading.…
Descriptors: Essay Tests, Experimenter Characteristics, Higher Education, Personality Traits
Sinnott, Loraine T. – 1982
A standard method for exploring item bias is the intergroup comparison of item difficulties. This paper describes a refinement and generalization of this technique. In contrast to prior approaches, the proposed method deletes outlying items from the formulation of a criterion for identifying items as deviant. It also extends the mathematical…
Descriptors: College Entrance Examinations, Difficulty Level, Higher Education, Item Analysis

Wen, Shih-Sung – Journal of Educational Measurement, 1975
The relationship between students' scores on a verbal meaning test and their degrees of confidence in item responses was investigated. Subjects were black undergraduate students and they were administered a verbal meaning test by following a confidence testing procedure. (Author/BJG)
Descriptors: Blacks, Confidence Testing, Higher Education, Language Skills
Willmington, S. Clay; Steinbrecher, Milda M. – 1993
A "Fundamentals of Speech Communication" course is required of all college students, and upon completion of such a course students should possess those basic speaking and listening skills necessary to complete successfully their college educations. With a view toward developing a new, more effective listening test, a study examined…
Descriptors: Communication Research, Higher Education, Introductory Courses, Listening Comprehension
Webb, Lynn C.; And Others – 1990
Two aspects of rater accuracy in performance assessment were analyzed: rater stringency/leniency, and rater consistency. Data were obtained from three administrations of an oral certification examination in a health profession. The examination consists of clinical cases in four content areas or subspecialities. A total of 364 candidates were…
Descriptors: Allied Health Occupations, Evaluation Methods, Evaluators, Higher Education