Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 0 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 11 |
Author
Dorans, Neil J. | 2 |
Egley, Robert J. | 2 |
Jones, Brett D. | 2 |
Koretz, Daniel | 2 |
Abu-Alhija, Fadia Nasser | 1 |
Airasian, Peter W. | 1 |
Alting, Annita | 1 |
Amrein, Audrey L. | 1 |
Angel, Dan | 1 |
Berliner, David C. | 1 |
Buckendahl, Chad W. | 1 |
Education Level
Elementary Secondary Education | 5 |
Elementary Education | 2 |
Higher Education | 2 |
Adult Education | 1 |
High Schools | 1 |
Postsecondary Education | 1 |
Audience
Practitioners | 2 |
Researchers | 1 |
Location
Florida | 4 |
Kentucky | 2 |
New Jersey | 2 |
United States | 2 |
Vermont | 2 |
California | 1 |
Canada | 1 |
China | 1 |
Connecticut | 1 |
Israel | 1 |
Kansas | 1 |
Laws, Policies, & Programs
No Child Left Behind Act 2001 | 2 |
Tannenbaum, Richard J.; Kane, Michael T. – ETS Research Report Series, 2019
Testing programs are often classified as high or low stakes to indicate how stringently they need to be evaluated. However, in practice, this classification falls short. A high-stakes label is taken to imply that all indicators of measurement quality must meet high standards, whereas a low-stakes label is taken to imply the opposite. This approach…
Descriptors: High Stakes Tests, Testing Programs, Measurement, Evaluation Criteria
Koretz, Daniel – Measurement: Interdisciplinary Research and Perspectives, 2013
Haertel's argument is that one must "expand the scope of test validation to include indirect testing effects" because these effects are often the "rationale for the entire testing program." The author strongly agrees that this is essential. However, he maintains that Haertel's argument does not go far enough and that there are two additional…
Descriptors: Educational Testing, Test Validity, Test Results, Testing Programs
Lane, Suzanne – Measurement: Interdisciplinary Research and Perspectives, 2012
Considering consequences in the evaluation of validity is not new, although it is still debated by Paul E. Newton and others. The argument-based approach to validity entails an interpretative argument that explicitly identifies the proposed interpretations and uses of test scores and a validity argument that provides a structure for evaluating the…
Descriptors: Educational Opportunities, Accountability, Validity, Inferences
Sinharay, Sandip; Dorans, Neil J.; Liang, Longjuan – Educational Measurement: Issues and Practice, 2011
Over the past few decades, those who take tests in the United States have exhibited increasing diversity with respect to native language. Standard psychometric procedures for ensuring item and test fairness that have existed for some time were developed when test-taking groups were predominantly native English speakers. A better understanding of…
Descriptors: Test Bias, Testing Programs, Psychometrics, Language Proficiency
Moses, Tim; Liu, Jinghua; Tan, Adele; Deng, Weiling; Dorans, Neil J. – ETS Research Report Series, 2013
In this study, differential item functioning (DIF) methods utilizing 14 different matching variables were applied to assess DIF in the constructed-response (CR) items from 6 forms of 3 mixed-format tests. Results suggested that the methods might produce distinct patterns of DIF results for different tests and testing programs, in that the DIF…
Descriptors: Test Construction, Multiple Choice Tests, Test Items, Item Analysis
Davis, Susan L.; Buckendahl, Chad W.; Plake, Barbara S. – Journal of Educational Measurement, 2008
As an alternative to adaptation, tests may also be developed simultaneously in multiple languages. Although the items on such tests could vary substantially, scores from these tests may be used to make the same types of decisions about different groups of examinees. The ability to make such decisions is contingent upon setting performance…
Descriptors: Test Results, Testing Programs, Multilingualism, Standard Setting

Jones, Brett D.; Egley, Robert J. – ERS Spectrum, 2010
The purpose of this study was to determine how elementary school administrators in a large U.S. state perceived the overall effects of testing on education in general and, more specifically, on their instructional leadership responsibilities. We surveyed 325 Florida principals and assistant principals, many of whom viewed the testing program…
Descriptors: Assistant Principals, Test Results, Testing Programs, Testing
Ferrara, Steve; Perie, Marianne; Johnson, Eugene – Journal of Applied Testing Technology, 2008
Psychometricians continue to introduce new approaches to setting cut scores for educational assessments in an attempt to improve on current methods. In this paper we describe the Item-Descriptor (ID) Matching method, a method based on IRT item mapping. In ID Matching, test content area experts match items (i.e., their judgments about the knowledge…
Descriptors: Test Results, Test Content, Testing Programs, Educational Testing
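The Item-Descriptor Matching entry above builds on IRT item mapping: each item is placed at the ability level where the modeled probability of a correct response reaches a chosen response-probability criterion, and items are then ordered by those locations. Below is a minimal sketch under assumed conditions (a 2PL model, an RP67 criterion, and made-up item parameters); it illustrates the item-mapping step only, not Ferrara, Perie, and Johnson's full standard-setting procedure.

```python
import math

def item_map_location(a, b, rp=0.67):
    """Ability (theta) at which a 2PL item with discrimination a and
    difficulty b reaches the response-probability criterion rp."""
    # Solve 1 / (1 + exp(-a * (theta - b))) = rp for theta.
    return b + math.log(rp / (1.0 - rp)) / a

# Hypothetical item parameters (a, b), purely for illustration.
items = [("item_1", 1.2, -0.5), ("item_2", 0.8, 0.3), ("item_3", 1.5, 1.1)]

# Order the items as they would appear on an item map.
for name, a, b in sorted(items, key=lambda it: item_map_location(it[1], it[2])):
    print(f"{name}: theta = {item_map_location(a, b):.2f}")
```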
Abu-Alhija, Fadia Nasser – Studies in Educational Evaluation, 2007
This article discusses the positive and negative consequences of large-scale testing for five key stakeholders of testing results: students, teachers, administrators, policymakers, and parents. The factors that affect the nature of testing consequences are also discussed, and means that may provide remedies for associated pitfalls are proposed.
Descriptors: Testing Programs, Measurement, Student Evaluation, Test Results

Guskey, Thomas R.; Kifer, Edward W. – Educational Measurement: Issues and Practice, 1990
How state educational authorities in Kentucky use statewide test data to rank the state's 178 school districts was studied, using data from the "Kentucky Essential Skills Test: Statewide Testing Results" (1987). The methods used, means of refining those methods, the fairness/accuracy/validity of resulting interpretations, and problems…
Descriptors: School Districts, School Effectiveness, State Programs, Test Results

Ercikan, Kadriye – Applied Measurement in Education, 1997
Linking scores from the National Assessment of Educational Progress (NAEP) to statewide test results was studied. Results based on an equipercentile procedure suggest that such a link does not provide precise information. Information from a linking study should be limited to rough estimates of students in each NAEP achievement level. (SLD)
Descriptors: Equated Scores, Estimation (Mathematics), National Surveys, State Programs
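The equipercentile procedure referenced in the Ercikan (1997) entry above links two score scales by matching percentile ranks: a score on the state test is mapped to the NAEP score that holds the same rank in its own distribution. A minimal sketch follows, assuming raw (unsmoothed) sample distributions and entirely hypothetical score data; operational linkings use smoothed, continuized distributions.

```python
import numpy as np

def equipercentile_link(x_scores, y_scores, x_value):
    """Map x_value from the X score scale to the Y scale by matching
    percentile ranks (a raw, unsmoothed approximation)."""
    # Percentile rank of x_value in the X distribution, on a 0-100 scale.
    rank = np.mean(np.asarray(x_scores) <= x_value) * 100.0
    # Y-scale score at (approximately) the same percentile rank.
    return float(np.percentile(y_scores, rank))

# Hypothetical samples: a state-test scale and a NAEP-like scale.
rng = np.random.default_rng(0)
state_scores = rng.normal(500.0, 50.0, size=2000)
naep_scores = rng.normal(250.0, 30.0, size=2000)
print(equipercentile_link(state_scores, naep_scores, 550.0))
```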

Amrein, Audrey L.; Berliner, David C. – Education Policy Analysis Archives, 2002
Studied 18 states with high-stakes testing to see if their programs were affecting student learning, analyzing results from additional tests covering some of the same domain as each state's own test. Findings suggest that in all but one case, student learning is indeterminate, remains at the same level, or actually decreases with the…
Descriptors: Academic Achievement, High Stakes Tests, Learning, State Programs
Goodman, Dean P.; Hambleton, Ronald K. – Applied Measurement in Education, 2004
A critical, but often neglected, component of any large-scale assessment program is the reporting of test results. In the past decade, a body of evidence has been compiled that raises concerns over the ways in which these results are reported to and understood by their intended audiences. In this study, current approaches for reporting…
Descriptors: Test Results, Student Evaluation, Scores, Testing Programs
Morante, Edward A.; And Others – Journal of Developmental & Remedial Education, 1984
Describes New Jersey's coordinated, statewide higher education effort involving mandatory testing, remediation, and evaluation. Reviews the program's history; the role of the New Jersey Basic Skills Council (NJBSC); the Basic Skills Placement Test, which focuses on reading comprehension, sentence sense, math computation, and elementary algebra;…
Descriptors: Basic Skills, College Students, Educational Testing, Postsecondary Education

Sage, James E. – Journal of Studies in Technical Careers, 1979
Inquiry and lecture/lab methods of instruction yield different results in conceptual and problem-solving tests, while no difference exists in factual tests. The author recommends that the inquiry method of teaching be refined so that it can become a practical classroom instructional strategy. (CT)
Descriptors: Classroom Techniques, Inquiry, Laboratory Training, Lecture Method