Hwanggyu Lim; Danqi Zhu; Edison M. Choe; Kyung T. Han – Journal of Educational Measurement, 2024
This study presents a generalized version of the residual differential item functioning (RDIF) detection framework in item response theory, named GRDIF, to analyze differential item functioning (DIF) in multiple groups. The GRDIF framework retains the advantages of the original RDIF framework, such as computational efficiency and ease of…
Descriptors: Item Response Theory, Test Bias, Test Reliability, Test Construction
Hung-Yu Huang – Educational and Psychological Measurement, 2025
The use of discrete categorical formats to assess psychological traits has a long-standing tradition that is deeply embedded in item response theory models. The increasing prevalence and endorsement of computer- or web-based testing has led to greater focus on continuous response formats, which offer numerous advantages in both respondent…
Descriptors: Response Style (Tests), Psychological Characteristics, Item Response Theory, Test Reliability
Brennan, Robert L.; Kim, Stella Y.; Lee, Won-Chan – Educational and Psychological Measurement, 2022
This article extends multivariate generalizability theory (MGT) to tests with different random-effects designs for each level of a fixed facet. There are numerous situations in which the design of a test and the resulting data structure are not definable by a single design. One example is mixed-format tests that are composed of multiple-choice and…
Descriptors: Multivariate Analysis, Generalizability Theory, Multiple Choice Tests, Test Construction
Lu, Hong; Wang, Yue; Wu, Xiu-yuan; Lu, Xiao-feng; Liu, Xing-bo – Educational Technology Research and Development, 2023
The current study aimed to measure preservice teachers' skills in implementing a pedagogical model of collaborative problem solving (CPS) using a more advanced and novel human-to-agent computerized assessment instrument. By doing so, a framework with three major skills in the implementation of CPS and four major individual problem-solving processes…
Descriptors: Preservice Teachers, Cooperative Learning, Problem Solving, Artificial Intelligence
Benjamin W. Y. Hornsby; Stephen Camarata; Sun-Joo Cho; Hilary Davis; Ronan McGarrigle; Fred H. Bess – Journal of Speech, Language, and Hearing Research, 2022
Purpose: Growing evidence suggests that fatigue associated with listening difficulties is particularly problematic for children with hearing loss (CHL). However, sensitive, reliable, and valid measures of listening-related fatigue do not exist. To address this gap, this article describes the development, psychometric evaluation, and preliminary…
Descriptors: Test Construction, Fatigue (Biology), Hearing Impairments, Listening
Rodriguez, Rebekah M.; Silvia, Paul J.; Kaufman, James C.; Reiter-Palmon, Roni; Puryear, Jeb S. – Creativity Research Journal, 2023
The original 90-item Creative Behavior Inventory (CBI) was a landmark self-report scale in creativity research, and the 28-item brief form developed nearly 20 years ago continues to be a popular measure of everyday creativity. Relatively little is known, however, about the psychometric properties of this widely used scale. In the current research,…
Descriptors: Creativity Tests, Creativity, Creative Thinking, Psychometrics
You, Hye Sun; Park, Sunyoung; Marshall, Jill A.; Delgado, Cesar – Research in Science Education, 2022
Growing interest in interdisciplinary (ID) understanding has led to the recent development of four ID assessments, none of which have previously been comprehensively validated. Sources of evidence for the validity of tests include construct validity, such as the internal structure of the test. ID tests may (and should) test both disciplinary (D)…
Descriptors: High School Students, College Students, Interdisciplinary Approach, Test Construction
Jones, Andrew T.; Kopp, Jason P.; Ong, Thai Q. – Educational Measurement: Issues and Practice, 2020
Studies investigating invariance have often been limited to measurement or prediction invariance. Selection invariance, wherein the use of test scores for classification results in equivalent classification accuracy between groups, has received comparatively little attention in the psychometric literature. Previous research suggests that some form…
Descriptors: Test Construction, Test Bias, Classification, Accuracy
Rosario A. Marroquín-Flores; Rose Marie Tijerina; Mason Tedeschi; Sofia Banjara; Redmon Warmsley; Luke McFather; Zianna Casas; Lisa B. Limeri – CBE - Life Sciences Education, 2024
Students who hold minoritized identities are underrepresented in science, technology, engineering, and math (STEM) fields. Educational institutions often apply a deficit lens to understanding disproportionate outcomes between minoritized students and those from the cultural majority. Community Cultural Wealth (CCW) is an asset-based framework that…
Descriptors: Undergraduate Students, Minority Group Students, Low Income Students, STEM Education
Areekkuzhiyil, Santhosh – Online Submission, 2021
Assessment is an integral part of any teaching-learning process. Assessment has a large number of functions to perform, whether it is formative or summative. This paper analyses the issues involved and the areas of concern in classroom assessment practice and discusses the recent reforms taking place. [This paper was published in Edutracks v20 n8…
Descriptors: Student Evaluation, Formative Evaluation, Summative Evaluation, Test Validity
Patrick C. Kyllonen; Amit Sevak; Teresa Ober; Ikkyu Choi; Jesse Sparks; Daniel Fishtein – ETS Research Institute, 2024
Assessment refers to a broad array of approaches for measuring or evaluating a person's (or group of persons') skills, behaviors, dispositions, or other attributes. Assessments range from standardized tests used in admissions, employee selection, licensure examinations, and domestic and international large-scale assessments of cognitive and…
Descriptors: Performance Based Assessment, Evaluation Criteria, Evaluation Methods, Test Bias
Smith, Robert L.; Karaman, Mehmet A. – International Journal of Psychology and Educational Studies, 2019
This study investigated the factorial validity of the Contextual Achievement Motivation Scale, assessing achievement motivation in multiple settings with a sample of 493 undergraduate and graduate students. Exploratory factor analysis identified a four-factor model: School (6 items), Employment/Work (6 items), Family (5 items), Community (4…
Descriptors: Achievement Need, Measures (Individuals), Test Construction, Test Validity
Rios, Joseph A.; Sparks, Jesse R.; Zhang, Mo; Liu, Ou Lydia – ETS Research Report Series, 2017
Proficiency with written communication (WC) is critical for success in college and careers. As a result, institutions face a growing challenge to accurately evaluate their students' writing skills to obtain data that can support demands of accreditation, accountability, or curricular improvement. Many current standardized measures, however, lack…
Descriptors: Test Construction, Test Validity, Writing Tests, College Outcomes Assessment
International Journal of Testing, 2019
These guidelines describe considerations relevant to the assessment of test takers in or across countries or regions that are linguistically or culturally diverse. The guidelines were developed by a committee of experts to help inform test developers, psychometricians, test users, and test administrators about fairness issues in support of the…
Descriptors: Test Bias, Student Diversity, Cultural Differences, Language Usage
Ray, Brian; Babb, Jacob; Wooten, Courtney Adams – Composition Studies, 2018
Student evaluations of teaching (SETs) are frequently used to assess college teachers. However, education research has shown that there is potential for bias in SETs, especially based on instructor variables. Aside from Amy Dayton's 2015 work on assessment that advises using SETs only in concert with other measures, English studies scholars have…
Descriptors: Student Evaluation of Teacher Performance, Teacher Evaluation, Educational History, Test Bias