ERIC - Search Results

Showing 1 to 15 of 50 results

Using Regularization to Identify Measurement Bias across Multiple Background Characteristics: A Penalized Expectation-Maximization Algorithm

Peer reviewed

Belzak, William C. M.; Bauer, Daniel J. – Journal of Educational and Behavioral Statistics, 2024

Testing for differential item functioning (DIF) has undergone rapid statistical developments recently. Moderated nonlinear factor analysis (MNLFA) allows for simultaneous testing of DIF among multiple categorical and continuous covariates (e.g., sex, age, ethnicity, etc.), and regularization has shown promising results for identifying DIF among…

Descriptors: Test Bias, Algorithms, Factor Analysis, Error of Measurement

Digital Module 12: Think-Aloud Interviews and Cognitive Labs https://ncme.elevate.commpartners.com

Peer reviewed

Leighton, Jacqueline P.; Lehman, Blair – Educational Measurement: Issues and Practice, 2020

In this digital ITEMS module, Dr. Jacqueline Leighton and Dr. Blair Lehman review differences between think-aloud interviews to measure problem-solving processes and cognitive labs to measure comprehension processes. Learners are introduced to historical, theoretical, and procedural differences between these methods and how to use and analyze…

Descriptors: Protocol Analysis, Interviews, Problem Solving, Cognitive Processes

Mountain or Molehill? A Simulation Study on the Impact of Response Styles

Peer reviewed

Plieninger, Hansjörg – Educational and Psychological Measurement, 2017

Even though there is an increasing interest in response styles, the field lacks a systematic investigation of the bias that response styles potentially cause. Therefore, a simulation was carried out to study this phenomenon with a focus on applied settings (reliability, validity, scale scores). The influence of acquiescence and extreme response…

Descriptors: Response Style (Tests), Test Bias, Item Response Theory, Correlation

Pilot Evaluation of the Computer-Based Assessment of Non-Cognitive Attributes of Health Professionals (CANA-HP)

Maher, Sara Faye – ProQuest LLC, 2020

To meet the needs of complex and/or underserved patient populations, health care professionals must possess diverse backgrounds, qualities, and skill sets. Holistic review has been used to diversify student admissions through examination of non-cognitive attributes of health care applicants. The objective of this study was to develop a novel…

Descriptors: Computer Assisted Testing, Pilot Projects, Measures (Individuals), Reliability

Psychometric Properties of the Chinese Parent Version of the Autism Spectrum Rating Scale: Rasch Analysis

Peer reviewed

Yan, Weili; Siegert, Richard J.; Zhou, Hao; Zou, Xiaobing; Wu, Lijie; Luo, Xuerong; Li, Tingyu; Huang, Yi; Guan, Hongyan; Chen, Xiang; Mao, Meng; Xia, Kun; Zhang, Lan; Li, Erzhen; Li, Chunpei; Zhang, Xudong; Zhou, Yuanfeng; Shih, Andy; Fombonne, Eric; Zheng, Yi; Han, Jisheng; Sun, Zhongsheng; Jiang, Yong-hui; Wang, Yi – Autism: The International Journal of Research and Practice, 2021

The recent adaptation of a Chinese parent version of the Autism Spectrum Rating Scale showed the Modified Chinese Autism Spectrum Rating Scale to be reliable and valid for use in China. The aim of this study was to test the Modified Chinese Autism Spectrum Rating Scale for fit to the Rasch model. We analysed data from a previous study of the…

Descriptors: Psychometrics, Parent Attitudes, Autism, Pervasive Developmental Disorders

Administrators Gaming Test- and Observation-Based Teacher Evaluation Methods: To Conform To or Confront the System

Peer reviewed

Geiger, Tray J.; Amrein-Beardsley, Audrey – AASA Journal of Scholarship & Practice, 2017

In this commentary, we discuss three types of data manipulations that can occur within teacher evaluation methods: artificial inflation, artificial deflation, and artificial conflation. These types of manipulation are more popularly known in the education profession as instances of Campbell's Law (1976), which states that the higher the…

Descriptors: Teacher Evaluation, Evaluation Methods, Data Analysis, Personnel Policy

Estimating Variance Components from Sparse Data Matrices in Large-Scale Educational Assessments

Peer reviewed

DeMars, Christine – Applied Measurement in Education, 2015

In generalizability theory studies in large-scale testing contexts, sometimes a facet is very sparsely crossed with the object of measurement. For example, when assessments are scored by human raters, it may not be practical to have every rater score all students. Sometimes the scoring is systematically designed such that the raters are…

Descriptors: Educational Assessment, Measurement, Data, Generalizability Theory

On the Bias-Amplifying Effect of Near Instruments in Observational Studies

Peer reviewed

Steiner, Peter M.; Kim, Yongnam – Society for Research on Educational Effectiveness, 2014

In contrast to randomized experiments, the estimation of unbiased treatment effects from observational data requires an analysis that conditions on all confounding covariates. Conditioning on covariates can be done via standard parametric regression techniques or nonparametric matching like propensity score (PS) matching. The regression or…

Descriptors: Observation, Research Methodology, Test Bias, Regression (Statistics)

Exploring Item Order in Anxiety-Related Constructs: Practical Impacts of Serial Position

Peer reviewed

Carleton, R. Nicholas; Thibodeau, Michel A.; Osborne, Jason W.; Asmundson, Gordon J. G. – Practical Assessment, Research & Evaluation, 2012

The present study was designed to test for item order effects by measuring four distinct constructs that contribute substantively to anxiety-related psychopathology (i.e., anxiety sensitivity, fear of negative evaluation, injury/illness sensitivity, and intolerance of uncertainty). Participants (n = 999; 71% women) were randomly assigned to…

Descriptors: Anxiety, Test Items, Serial Ordering, Measures (Individuals)

Applying Longitudinal Mean and Covariance Structures (LMACS) Analysis to Assess Construct Stability Over Two Time Points: An Example Using Psychological Entitlement

Peer reviewed

Bashkov, Bozhidar M.; Finney, Sara J. – Measurement and Evaluation in Counseling and Development, 2013

Traditional methods of assessing construct stability are reviewed and longitudinal mean and covariance structures (LMACS) analysis, a modern approach, is didactically illustrated using psychological entitlement data. Measurement invariance and latent variable stability results are interpreted, emphasizing substantive implications for educators and…

Descriptors: Statistical Analysis, Longitudinal Studies, Reliability, Psychological Patterns

Sequential Effects in Essay Ratings

Peer reviewed

Attali, Yigal – Educational and Psychological Measurement, 2011

Contrary to previous research on sequential ratings of student performance, this study found that professional essay raters of a large-scale standardized testing program produced ratings that were drawn toward previous ratings, creating an assimilation effect. Longer intervals between the two adjacent ratings and higher degree of agreement with…

Descriptors: Essay Tests, Standardized Tests, Sequential Approach, Test Bias

Do Different Approaches to Examining Construct Comparability in Multilanguage Assessments Lead to Similar Conclusions?

Peer reviewed

Oliveri, Maria E.; Ercikan, Kadriye – Applied Measurement in Education, 2011

In this study, we examine the degree of construct comparability and possible sources of incomparability of the English and French versions of the Programme for International Student Assessment (PISA) 2003 problem-solving measure administered in Canada. Several approaches were used to examine construct comparability at the test- (examination of…

Descriptors: Foreign Countries, English, French, Tests

Investigating ESL Students' Performance on Outcomes Assessments in Higher Education

Peer reviewed

Lakin, Joni M.; Elliott, Diane Cardenas; Liu, Ou Lydia – Educational and Psychological Measurement, 2012

Outcomes assessments are gaining great attention in higher education because of increased demand for accountability. These assessments are widely used by U.S. higher education institutions to measure students' college-level knowledge and skills, including students who speak English as a second language (ESL). For the past decade, the increasing…

Descriptors: College Outcomes Assessment, Achievement Tests, English Language Learners, College Students

Item Screening in Graphical Loglinear Rasch Models

Peer reviewed

Kreiner, Svend; Christensen, Karl Bang – Psychometrika, 2011

In behavioural sciences, local dependence and DIF are common, and purification procedures that eliminate items with these weaknesses often result in short scales with poor reliability. Graphical loglinear Rasch models (Kreiner & Christensen, in "Statistical Methods for Quality of Life Studies," ed. by M. Mesbah, F.C. Cole & M.T.…

Descriptors: Evidence, Markov Processes, Quality of Life, Item Analysis

Methodologies for Investigating Item- and Test-Level Measurement Equivalence in International Large-Scale Assessments

Peer reviewed

Oliveri, Maria Elena; Olson, Brent F.; Ercikan, Kadriye; Zumbo, Bruno D. – International Journal of Testing, 2012

In this study, the Canadian English and French versions of the Problem-Solving Measure of the Programme for International Student Assessment 2003 were examined to investigate their degree of measurement comparability at the item- and test-levels. Three methods of differential item functioning (DIF) were compared: parametric and nonparametric item…

Descriptors: Foreign Students, Test Bias, Speech Communication, Effect Size
