Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 2 |
Since 2016 (last 10 years) | 5 |
Since 2006 (last 20 years) | 15 |
Descriptor
Scores | 93 |
Test Construction | 93 |
Test Results | 93 |
Achievement Tests | 33 |
Elementary Secondary Education | 30 |
Testing Programs | 28 |
Academic Achievement | 26 |
Test Interpretation | 26 |
Test Items | 23 |
State Programs | 20 |
Mathematics Achievement | 16 |
Author
Hambleton, Ronald K. | 2 |
Klein, Stephen P. | 2 |
Bergman, Lincoln | 1 |
Bock, R. Darrell | 1 |
Boyer, Michelle | 1 |
Calkins, Lucy | 1 |
Campbell, Jay R. | 1 |
Carroll, John B. | 1 |
Crisp, Geoffrey T. | 1 |
Cronbach, Lee J. | 1 |
Dean, Paul | 1 |
Education Level
Higher Education | 4 |
Postsecondary Education | 4 |
Elementary Secondary Education | 2 |
Secondary Education | 2 |
Elementary Education | 1 |
Grade 11 | 1 |
Grade 12 | 1 |
Grade 9 | 1 |
High Schools | 1 |
Audience
Practitioners | 13 |
Teachers | 11 |
Administrators | 5 |
Parents | 4 |
Community | 2 |
Researchers | 2 |
Counselors | 1 |
Policymakers | 1 |
Location
Florida | 8 |
Delaware | 7 |
California | 3 |
Oklahoma | 3 |
Connecticut | 2 |
Sweden | 2 |
Australia | 1 |
Canada | 1 |
Colorado | 1 |
Federated States of Micronesia | 1 |
Hawaii | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 2 |
Student, Sanford R.; Gong, Brian – Educational Measurement: Issues and Practice, 2022
We address two persistent challenges in large-scale assessments of the Next Generation Science Standards: (a) the validity of score interpretations that target the standards broadly and (b) how to structure claims for assessments of this complex domain. The NGSS pose a particular challenge for specifying claims about students that evidence from…
Descriptors: Science Tests, Test Validity, Test Items, Test Construction
Designing Computer-Based Tests: Design Guidelines from Multimedia Learning Studied with Eye Tracking
Dirkx, K. J. H.; Skuballa, I.; Manastirean-Zijlstra, C. S.; Jarodzka, H. – Instructional Science: An International Journal of the Learning Sciences, 2021
The use of computer-based tests (CBTs), for both formative and summative purposes, has greatly increased over the past years. One major advantage of CBTs is the easy integration of multimedia. It is unclear, though, how to design such CBT environments with multimedia. The purpose of the current study was to examine whether guidelines for designing…
Descriptors: Test Construction, Computer Assisted Testing, Multimedia Instruction, Eye Movements
Keng, Leslie; Boyer, Michelle – National Center for the Improvement of Educational Assessment, 2020
ACT requested assistance from the National Center for the Improvement of Educational Assessment (Center for Assessment) to investigate declines of scores for states administering the ACT to its 11th grade students in 2018. This request emerged from conversations among state leaders, the Center for Assessment, and ACT in trying to understand the…
Descriptors: College Entrance Examinations, Scores, Test Score Decline, Educational Trends
Tengberg, Michael – Language Assessment Quarterly, 2018
Reading comprehension is often treated as a multidimensional construct. In many reading tests, items are distributed over reading process categories to represent the subskills expected to constitute comprehension. This study explores (a) the extent to which specified subskills of reading comprehension tests are conceptually conceivable to…
Descriptors: Reading Tests, Reading Comprehension, Scores, Test Results
OECD Publishing, 2019
The OECD Programme for International Student Assessment (PISA) examines what students know in reading, mathematics and science, and what they can do with what they know. It provides the most comprehensive and rigorous international assessment of student learning outcomes to date. Results from PISA indicate the quality and equity of learning…
Descriptors: Test Results, Achievement Tests, Foreign Countries, International Assessment
Zenisky, April L.; Hambleton, Ronald K. – Educational Measurement: Issues and Practice, 2012
Test scores matter these days. Test-takers want to understand how they performed, and test score reports, particularly those for individual examinees, are the vehicles by which most people get the bulk of this information. Historically, score reports have not always met the examinees' information or usability needs, but this is clearly changing…
Descriptors: Scores, Psychometrics, Test Results, Usability
Warner, Zachary B. – ProQuest LLC, 2013
This study compared an expert-based cognitive model of domain mastery with student-based cognitive models of task performance for Integrated Algebra. Interpretations of student test results are limited by experts' hypotheses of how students interact with the items. In reality, the cognitive processes that students use to solve each item may be…
Descriptors: Comparative Analysis, Algebra, Test Results, Measurement
Hooker, Giles; Finkelman, Matthew – Psychometrika, 2010
Hooker, Finkelman, and Schwartzman ("Psychometrika," 2009, in press) defined a paradoxical result as the attainment of a higher test score by changing answers from correct to incorrect and demonstrated that such results are unavoidable for maximum likelihood estimates in multidimensional item response theory. The potential for these results to…
Descriptors: Models, Scores, Item Response Theory, Psychometrics
Jorgenson, Olaf – Principal, 2012
To achieve perpetually better test results each year as mandated by the No Child Left Behind Act (NCLB), teachers in successful schools such as Leroy Anderson Elementary in San Jose, California, will "try anything" to raise scores, as the school's principal stated in an interview with "The San Jose Mercury News." In schools…
Descriptors: Academic Achievement, Testing, Teaching Methods, Standardized Tests
Moses, Tim; Liu, Jinghua; Tan, Adele; Deng, Weiling; Dorans, Neil J. – ETS Research Report Series, 2013
In this study, differential item functioning (DIF) methods utilizing 14 different matching variables were applied to assess DIF in the constructed-response (CR) items from 6 forms of 3 mixed-format tests. Results suggested that the methods might produce distinct patterns of DIF results for different tests and testing programs, in that the DIF…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Item Analysis
Froman, Terry – Research Services, Miami-Dade County Public Schools, 2010
The release of School Grades by the State of Florida in the late summer of 2010 created quite a stir. Over 500 elementary schools all over the state dropped at least one grade level. Among those schools that experienced drops, the Reading Gains and Mathematics Gains components seemed especially low. Some of the historically best performing schools…
Descriptors: Test Results, Scores, Achievement Tests, Achievement Gains
Mueller Gathercole, Virginia C.; Thomas, Enlli Mon; Hughes, Emma – International Journal of Bilingual Education and Bilingualism, 2008
The purpose of this paper is to propose an applied model for the assessment of bilingual children's language abilities with standardised tests. We discuss the purposes of such tests, especially in relation to vocabulary knowledge, and potential applications of test results for each of those purposes. The specific case to be examined here is that…
Descriptors: Test Results, Language Tests, Monolingualism, Vocabulary Development
Klinger, Don A.; Rogers, W. Todd – Assessment in Education: Principles, Policy and Practice, 2006
Driven largely by calls for accountability, the use of large-scale testing is expanding in terms of the number and purposes of testing programmes. At the same time, financial constraints have resulted in attempts to reduce the lengths of such examinations. An examination of the 1994/1995 and 1995/1996 British Columbia Scholarship programme…
Descriptors: High Stakes Tests, Educational Change, Foreign Countries, Scholarships

Wainer, Howard; Sheehan, Kathleen M.; Wang, Xiaohui – Journal of Educational Measurement, 2000
Describes an analytic method for aiding in the generation of subscores that characterize the deep structure of tests and derives a procedure for estimating scores for those scales that are more statistically stable than subscores composed solely of items contained on that scale. Used data from a Praxis administration (9,278 examinees) to show the…
Descriptors: Estimation (Mathematics), Measures (Individuals), Scores, Teacher Evaluation

Wolfe, Edward W.; Gitomer, Drew H. – Applied Measurement in Education, 2001
Attempted to improve the measurement quality of a complex performance assessment through principled assessment design using the example of the National Board for Professional Teaching Standards Early Childhood/Generalist examination. All indexes examined improved after revisions were made. Results show the importance of attention to assessment…
Descriptors: Change, Performance Based Assessment, Psychometrics, Scores