Showing 1 to 15 of 155 results
Peer reviewed
Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022
We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…
Descriptors: Science Tests, Test Validity, Test Items, Test Construction
Peer reviewed
Hargreaves, A. – Journal of Educational Change, 2020
This paper analyzes the nature and perceived effects of mid-stakes testing (known as the EQAO) in Ontario, Canada. Ontario's mid-stakes tests were meant to ensure accountability and transparency, and assure system-wide improvement, while avoiding the negative effects and perverse incentives of their high-stakes counterparts. The paper provides new…
Descriptors: Foreign Countries, Educational Testing, School Districts, Educational Change
W. Jake Thompson – Grantee Submission, 2023
In educational and psychological research, we are often interested in discrete latent states of individuals responding to an assessment (e.g., proficiency or non-proficiency on educational standards, the presence or absence of a psychological disorder). Diagnostic classification models (DCMs; also called cognitive diagnostic models [CDMs]) are a…
Descriptors: Bayesian Statistics, Measurement, Psychometrics, Educational Research
Peer reviewed
Carney, Michele; Crawford, Angela; Siebert, Carl; Osguthorpe, Rich; Thiede, Keith – Applied Measurement in Education, 2019
The "Standards for Educational and Psychological Testing" recommend an argument-based approach to validation that involves a clear statement of the intended interpretation and use of test scores, the identification of the underlying assumptions and inferences in that statement--termed the interpretation/use argument, and gathering of…
Descriptors: Inquiry, Test Interpretation, Validity, Scores
Peer reviewed
Lenz, A. Stephen; Ault, Haley; Balkin, Richard S.; Barrio Minton, Casey; Erford, Bradley T.; Hays, Danica G.; Kim, Bryan S. K.; Li, Chi – Measurement and Evaluation in Counseling and Development, 2022
In April 2021, The Association for Assessment and Research in Counseling Executive Council commissioned a time-referenced task group to revise the Responsibilities of Users of Standardized Tests (RUST) Statement (3rd edition) published by the Association for Assessment in Counseling (AAC) in 2003. The task group developed a work plan to implement…
Descriptors: Responsibility, Standardized Tests, Counselor Training, Ethics
Peer reviewed
Volante, Louis; DeLuca, Christopher; Adie, Lenore; Baker, Eva; Harju-Luukkainen, Heidi; Heritage, Margaret; Schneider, Christoph; Stobart, Gordon; Tan, Kelvin; Wyatt-Smith, Claire – Educational Measurement: Issues and Practice, 2020
The synergy, or lack thereof, between large-scale and classroom assessment has been fiercely debated in both academic and policy spheres for decades around the world. This paper seeks to explicate how different countries are utilizing large-scale testing and test results at the classroom level. Through country profiles, this paper analyzes…
Descriptors: Educational Trends, Trend Analysis, Measurement, Teaching Methods
Peer reviewed
Tannenbaum, Richard J.; Kane, Michael T. – ETS Research Report Series, 2019
Testing programs are often classified as high or low stakes to indicate how stringently they need to be evaluated. However, in practice, this classification falls short. A high-stakes label is taken to imply that all indicators of measurement quality must meet high standards; whereas a low-stakes label is taken to imply the opposite. This approach…
Descriptors: High Stakes Tests, Testing Programs, Measurement, Evaluation Criteria
Peer reviewed
Gawliczek, Piotr; Krykun, Viktoriia; Tarasenko, Nataliya; Tyshchenko, Maksym; Shapran, Oleksandr – Advanced Education, 2021
The article deals with the innovative, cutting-edge solution within the language testing realm, namely computer adaptive language testing (CALT) in accordance with the NATO Standardization Agreement 6001 (NATO STANAG 6001) requirements for further implementation in foreign language training of personnel of the Armed Forces of Ukraine (AF of…
Descriptors: Computer Assisted Testing, Adaptive Testing, Language Tests, Second Language Instruction
Peer reviewed
Reckase, Mark D. – ETS Research Report Series, 2017
A common interpretation of achievement test results is that they provide measures of achievement that are much like other measures we commonly use for height, weight, or the cost of goods. In a limited sense, such interpretations are correct, but some nuances of these interpretations have important implications for the use of achievement test…
Descriptors: Models, Achievement Tests, Test Results, Test Construction
Peer reviewed
Bachman, Lyle – Measurement: Interdisciplinary Research and Perspectives, 2013
At the outset of his thoughtful and thought-provoking article, Haertel (this issue) clearly identifies the issue with which he will be dealing: The disjunct, or gap, in current approaches to evaluating the merits of a given test, between the intended uses of that test and the validity of its score-based interpretations. The author thinks that…
Descriptors: Educational Testing, Test Use, Test Validity, Test Interpretation
Peer reviewed
Black, Paul – Measurement: Interdisciplinary Research and Perspectives, 2012
Insofar as the title of this piece might call for a straightforward answer, it seems obvious that EPMA professionals are servants. Viewed in this perspective, Paul E. Newton's analysis is carefully balanced, in that it respects the complex history of the concerns of the professionals, whilst moving towards conclusions that place the needs of the…
Descriptors: Validity, Measurement, Test Results, Evaluation
Warner, Zachary B. – ProQuest LLC, 2013
This study compared an expert-based cognitive model of domain mastery with student-based cognitive models of task performance for Integrated Algebra. Interpretations of student test results are limited by experts' hypotheses of how students interact with the items. In reality, the cognitive processes that students use to solve each item may be…
Descriptors: Comparative Analysis, Algebra, Test Results, Measurement
Peer reviewed
Madsen, Adrian; McKagan, Sarah B.; Martinuk, Mathew Sandy; Bell, Alexander; Sayre, Eleanor C. – Physical Review Physics Education Research, 2016
To help faculty use research-based materials in a more significant way, we learn about their perceived needs and desires and use this information to suggest ways for the physics education research community to address these needs. When research-based resources are well aligned with the perceived needs of faculty, faculty members will more readily…
Descriptors: Physics, Science Education, Science Teachers, College Faculty
Peer reviewed
Camara, Wayne J.; Shaw, Emily J. – Educational Measurement: Issues and Practice, 2012
The measurement community needs to better understand how to interact with the media to effectively disseminate important findings from educational testing efforts. To this end, the current paper will review media coverage of educational testing and related issues and elaborate on areas of concern and opportunities for improved communication…
Descriptors: Test Results, Educational Testing, Measurement, Information Dissemination
Peer reviewed
Hoadley, Ursula; Muller, Johan – Curriculum Journal, 2016
Why has large-scale standardised testing attracted such a bad press? Why has pedagogic benefit to be derived from test results been downplayed? The paper investigates this question by first surveying the pros and cons of testing in the literature, and goes on to examine educators' responses to standardised, large-scale tests in a sample of low…
Descriptors: Foreign Countries, Standardized Tests, Developing Nations, Visual Discrimination