ERIC - Search Results

Showing all 14 results

Using State Tests vs. Study-Administered Tests to Measure Student Achievement: An Empirical Assessment Based on Four Recent Randomized Evaluations of Educational Interventions

Zhu, Pei; Somers, Marie-Andree; Wong, Edmond – Society for Research on Educational Effectiveness, 2010

For this project, the authors use data from four IES-sponsored randomized studies to examine some of the key issues identified in May et al. (2009). The first set of questions focuses on issues related to using state tests: (1) Do studies meet the assumptions needed for combining impacts on state tests across grades and/or states?; (2) How…

Descriptors: Academic Achievement, Program Effectiveness, State Standards, Testing Programs

Investigating the Suitability of Implementing the "e-rater"® Scoring Engine in a Large-Scale English Language Testing Program. Research Report. ETS RR-13-36

Peer reviewed

Zhang, Mo; Breyer, F. Jay; Lorenz, Florian – ETS Research Report Series, 2013

In this research, we investigated the suitability of implementing "e-rater"® automated essay scoring in a high-stakes large-scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt-based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement…

Descriptors: Computer Assisted Testing, Computer Software, Scoring, Language Tests

A Comparison of Approaches for Improving the Reliability of Objective Level Scores

Peer reviewed

Skorupski, William P.; Carvajal, Jorge – Educational and Psychological Measurement, 2010

This study is an evaluation of the psychometric issues associated with estimating objective level scores, often referred to as "subscores." The article begins by introducing the concepts of reliability and validity for subscores from statewide achievement tests. These issues are discussed with reference to popular scaling techniques, classical…

Descriptors: Testing Programs, Test Validity, Achievement Tests, Scores

Using the Method of Pairwise Comparison to Obtain Reliable Teacher Assessments

Peer reviewed

Heldsinger, Sandra; Humphry, Stephen – Australian Educational Researcher, 2010

Demands for accountability have seen the implementation of large scale testing programs in Australia and internationally. There is, however, a growing body of evidence to show that externally imposed testing programs do not have a sustained impact on student achievement. It has been argued that teacher assessment is more effective in raising…

Descriptors: Testing Programs, Testing, Academic Achievement, Measures (Individuals)

Whether and How to Use State Tests to Measure Student Achievement in a Multi-State Randomized Experiment: An Empirical Assessment Based on Four Recent Evaluations. NCEE 2012-4015

Peer reviewed

Somers, Marie-Andree; Zhu, Pei; Wong, Edmond – National Center for Education Evaluation and Regional Assistance, 2011

This study examines the practical implications of using state tests to measure student achievement in impact evaluations that span multiple states and grades. In particular, the study examines the sensitivity of impact findings to (1) the type of assessment used to measure achievement (state tests or an external assessment administered by the…

Descriptors: Evaluators, Grades (Scholastic), Academic Achievement, Program Effectiveness

An Alternative to Equating with Small Samples in the Non-Equivalent Groups Anchor Test Design. Research Report. ETS RR-06-27

Peer reviewed

Kim, Sooyeon; von Davier, Alina A.; Haberman, Shelby – ETS Research Report Series, 2006

This study addresses the sample error and linking bias that occur with small and unrepresentative samples in a non-equivalent groups anchor test (NEAT) design. We propose a linking method called the "synthetic function," which is a weighted average of the identity function (the trivial equating function for forms that are known to be…

Descriptors: Equated Scores, Sample Size, Test Items, Statistical Bias

Linking Statewide Tests to the National Assessment of Educational Progress: Stability of Results.

Peer reviewed

Linn, Robert L.; Kiplinger, Vonda L. – Applied Measurement in Education, 1995

The adequacy of linking statewide standardized test results to the National Assessment of Educational Progress by using equipercentile equating procedures was investigated using statewide mathematics data from four states. Results suggest that the linkings are not sufficiently trustworthy to make comparisons based on the tails of the distribution.…

Descriptors: Comparative Analysis, Educational Assessment, Equated Scores, Mathematics Tests

Performance Puzzles.

Peer reviewed

Resnick, Lauren B. – American Journal of Education, 1994

Explores issues involved in using assessments to define standards and encourage efforts to meet them and compares the European examination system with the American testing system. Also considered are issues of the definition of learning domains in ways that do not encourage narrowly focused training on specific assessment items. (SLD)

Descriptors: Academic Achievement, Comparative Analysis, Definitions, Educational Assessment

Performance-Based Assessment: Questions and Answers.

Seyfarth, John T. – 1993

Performance-based assessment refers to tasks that require students to construct responses or take actions to demonstrate specific knowledge or skills. Performance assessment tasks appear in a variety of formats, but they focus on higher order skills and are nonroutine, and sometimes loosely structured, in nature. A number of concerns have been…

Descriptors: Accountability, Comparative Analysis, Educational Assessment, Educational Change

Some Issues in Free Response Testing.

Pollack, Judith M. – 1990

This paper summarizes an investigation of applications and issues in free response (FR) testing during 1989. It draws on ideas from the results of the National Educational Longitudinal Study 1988 (NELS:88) field test, a seminar series at the Educational Testing Service (ETS), working papers prepared for several FR testing applications, and…

Descriptors: Comparative Analysis, Costs, Educational Assessment, Elementary Secondary Education

Toward Extended Assessment: The Big Picture.

Hacker, Jacob; Hathaway, Walter – 1991

Testing and assessment that are "more authentic" (performance-based or alternative) represent the most pressing issue in education today. Some of the major criticisms leveled at standardized testing are examined, and the advantages and disadvantages of more authentic assessment are reviewed. A general direction for integrating traditional and…

Descriptors: Comparative Analysis, Cost Effectiveness, Educational Assessment, Educational Trends

Comparison of Empirical and Judgmental Methods for Detecting Differential Item Functioning.

Hambleton, Ronald K.; Jones, Russell W. – 1992

The purpose of this study was to improve both statistical and judgmental methods for detecting potentially biased test items in an attempt to examine the agreement between the results obtained with these methods. If greater agreement between methods can be achieved, test items can be more effectively screened using judgmental methods prior to…

Descriptors: Achievement Tests, American Indians, Anglo Americans, Comparative Analysis

Public School Educator and Teacher Educator Job Analysis Ratings of Certification Test Objectives.

Silvestro, John R.; And Others – 1989

The job analysis procedures used in the development of the Illinois Certification Testing System are described. The degree of congruence between job analysis ratings provided by public school educators (PSEs) and teacher educators (TEs) who completed the job analysis surveys is examined. National Evaluation Systems, Inc., and the Illinois State…

Descriptors: Comparative Analysis, Content Analysis, Elementary Secondary Education, Interrater Reliability

Some Useful Cost-Benefit Criteria for Evaluating Computer-Based Test Delivery Models and Systems

Peer reviewed

Luecht, Richard M. – Journal of Applied Testing Technology, 2005

Computer-based testing (CBT) is typically implemented using one of three general test delivery models: (1) multiple fixed testing (MFT); (2) computer-adaptive testing (CAT); or (3) multistage testing (MSTs). This article reviews some of the real cost drivers associated with CBT implementation--focusing on item production costs, the costs…

Descriptors: Adaptive Testing, Computer Assisted Testing, Quality Control, Costs
