Zhu, Pei; Somers, Marie-Andree; Wong, Edmond – Society for Research on Educational Effectiveness, 2010
For this project, the authors use data from four IES-sponsored randomized studies to examine some of the key issues identified in May et al. (2009). The first set of questions focuses on issues related to using state tests: (1) Do studies meet the assumptions needed for combining impacts on state tests across grades and/or states?; (2) How…
Descriptors: Academic Achievement, Program Effectiveness, State Standards, Testing Programs
Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013
In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…
Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests
Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010
This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…
Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores
Heldsinger, Sandra; Humphry, Stephen – Australian Educational Researcher, 2010
Demands for accountability have seen the implementation of large scale testing programs in Australia and internationally. There is, however, a growing body of evidence to show that externally imposed testing programs do not have a sustained impact on student achievement. It has been argued that teacher assessment is more effective in raising…
Descriptors: Testing Programs, Testing, Academic Achievement, Measures (Individuals)
Somers, Marie-Andree; Zhu, Pei; Wong, Edmond – National Center for Education Evaluation and Regional Assistance, 2011
This study examines the practical implications of using state tests to measure student achievement in impact evaluations that span multiple states and grades. In particular, the study examines the sensitivity of impact findings to (1) the type of assessment used to measure achievement (state tests or an external assessment administered by the…
Descriptors: Evaluators, Grades (Scholastic), Academic Achievement, Program Effectiveness
Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – ETS Research Report Series, 2006
This study addresses the sample error and linking bias that occur with small and unrepresentative samples in a non-equivalent groups anchor test (NEAT) design. We propose a linking method called the "synthetic function," which is a weighted average of the identity function (the trivial equating function for forms that are known to be…
Descriptors: Equated Scores, Sample Size, Test Items, Statistical Bias
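As a rough illustration of the weighted-average idea in the abstract above: the abstract is cut off before it names the second component, so the sketch below simply assumes it is some estimated linking function supplied by the caller. The names `synthetic_link`, `estimated_link`, and `weight` are hypothetical and are not taken from the report.

```python
def synthetic_link(x, estimated_link, weight):
    """Weighted average of an estimated linking function and the identity.

    Assumption: `estimated_link` stands in for whatever linking function
    the report combines with the identity; the abstract is truncated
    before it says which one. `weight` is the share given to the
    estimated function and must lie in [0, 1].
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    # Identity function: the raw score x is returned unchanged.
    identity = x
    return weight * estimated_link(x) + (1.0 - weight) * identity


# Hypothetical estimated link that adds 4 points to every raw score.
shifted = lambda score: score + 4.0
halfway = synthetic_link(20.0, shifted, 0.5)
```

With `weight` at 0 the result is the raw score itself, and with `weight` at 1 it is the estimated link alone; intermediate weights pull a noisy small-sample estimate back toward the identity.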

Linn, Robert L.; Kiplinger, Vonda L. – Applied Measurement in Education, 1995
The adequacy of linking statewide standardized test results to the National Assessment of Educational Progress by using equipercentile equating procedures was investigated using statewide mathematics data from four states. Results suggest that the linkings are not sufficiently trustworthy to make comparisons based on the tails of the distribution.…
Descriptors: Comparative Analysis, Educational Assessment, Equated Scores, Mathematics Tests
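For readers unfamiliar with equipercentile equating, the following is a minimal sketch of the general idea, not the procedure used in the study: a score on one test is mapped to the score on the other test that has the same percentile rank. The function name, the midpoint percentile-rank convention, and the sample data are all assumptions made for illustration.

```python
import numpy as np


def equipercentile_link(x_scores, y_scores, x):
    """Map score x on test X to the test-Y score with the same percentile rank.

    Uses the midpoint percentile rank (half of the tied scores count as
    below x) and linear interpolation of test Y's empirical quantiles.
    """
    x_sorted = np.sort(np.asarray(x_scores, dtype=float))
    y_values = np.asarray(y_scores, dtype=float)
    below = np.searchsorted(x_sorted, x, side="left")         # scores < x
    at_or_below = np.searchsorted(x_sorted, x, side="right")  # scores <= x
    rank = (below + at_or_below) / 2.0 / x_sorted.size
    return float(np.quantile(y_values, rank))


# Made-up data: a 0-100 state scale and a 200-400 external scale.
state_scores = np.arange(0, 101)
external_scores = 200 + 2 * np.arange(0, 101)
linked_mid = equipercentile_link(state_scores, external_scores, 50)
```

Quantile estimates rest on few observations near the extremes of a distribution, which is one plausible reason such linkings become unstable in the tails.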

Resnick, Lauren B. – American Journal of Education, 1994
Explores issues involved in using assessments to define standards and encourage efforts to meet them and compares the European examination system with the American testing system. Also considered are issues of the definition of learning domains in ways that do not encourage narrowly focused training on specific assessment items. (SLD)
Descriptors: Academic Achievement, Comparative Analysis, Definitions, Educational Assessment
Seyfarth, John T. – 1993
Performance based assessment refers to tasks that require students to construct responses or take actions to demonstrate specific knowledge or skills. Performance assessment tasks appear in a variety of formats, but they focus on higher order skills and are nonroutine, and sometimes loosely structured, in nature. A number of concerns have been…
Descriptors: Accountability, Comparative Analysis, Educational Assessment, Educational Change
Pollack, Judith M. – 1990
This paper summarizes an investigation of applications and issues in free response (FR) testing during 1989. It draws on ideas from the results of the National Educational Longitudinal Study 1988 (NELS:88) field test, a seminar series at the Educational Testing Service (ETS), working papers prepared for several FR testing applications, and…
Descriptors: Comparative Analysis, Costs, Educational Assessment, Elementary Secondary Education
Hacker, Jacob; Hathaway, Walter – 1991
Testing and assessment that are "more authentic" (performance-based or alternative) represent the most pressing issue in education today. Some of the major criticisms leveled at standardized testing are examined, and the advantages and disadvantages of more authentic assessment are reviewed. A general direction for integrating traditional and…
Descriptors: Comparative Analysis, Cost Effectiveness, Educational Assessment, Educational Trends
Hambleton, Ronald K.; Jones, Russell W. – 1992
The purpose of this study was to improve both statistical and judgmental methods for detecting potentially biased test items in an attempt to examine the agreement between the results obtained with these methods. If greater agreement between methods can be achieved, test items can be more effectively screened using judgmental methods prior to…
Descriptors: Achievement Tests, American Indians, Anglo Americans, Comparative Analysis
Silvestro, John R.; And Others – 1989
The job analysis procedures used in the development of the Illinois Certification Testing System are described. The degree of congruence between job analysis ratings provided by public school educators (PSEs) and teacher educators (TEs) who completed the job analysis surveys is examined. National Evaluation Systems, Inc., and the Illinois State…
Descriptors: Comparative Analysis, Content Analysis, Elementary Secondary Education, Interrater Reliability
Luecht, Richard M. – Journal of Applied Testing Technology, 2005
Computer-based testing (CBT) is typically implemented using one of three general test delivery models: (1) multiple fixed testing (MFT); (2) computer-adaptive testing (CAT); or (3) multistage testing (MST). This article reviews some of the real cost drivers associated with CBT implementation--focusing on item production costs, the costs…
Descriptors: Adaptive Testing, Computer Assisted Testing, Quality Control, Costs