Wolkowitz, Amanda A.; Davis-Becker, Susan L.; Gerrow, Jack D. – Journal of Applied Testing Technology, 2016
The purpose of this study was to investigate the impact of a cheating prevention strategy employed for a professional credentialing exam that involved releasing over 7,000 active and retired exam items. This study evaluated: 1) If any significant differences existed between examinee performance on released versus non-released items; 2) If item…
Descriptors: Cheating, Test Content, Test Items, Foreign Countries
Dadey, Nathan; Lyons, Susan; DePascale, Charles – Applied Measurement in Education, 2018
Evidence of comparability is generally needed whenever there are variations in the conditions of an assessment administration, including variations introduced by the administration of an assessment on multiple digital devices (e.g., tablet, laptop, desktop). This article is meant to provide a comprehensive examination of issues relevant to the…
Descriptors: Evaluation Methods, Computer Assisted Testing, Educational Technology, Technology Uses in Education
Sanders, Sara – National Technical Assistance Center for the Education of Neglected or Delinquent Children and Youth (NDTAC), 2019
This guide is designed to assist States, agencies, and/or facilities who work with youth who are neglected, delinquent, or at-risk (N or D). The information in the guide will benefit those who are (a) interested in implementing pre-posttests, (b) in the process of identifying an appropriate pre-posttest, or (c) ready to evaluate current testing…
Descriptors: At Risk Students, Delinquency, Pretests Posttests, Testing
Arjoon, Janelle A.; Xu, Xiaoying; Lewis, Jennifer E. – Journal of Chemical Education, 2013
Many of the instruments developed for research use by the chemistry education community are relatively new. Because psychometric evidence dictates the validity of interpretations made from test scores, gathering and reporting validity and reliability evidence is of utmost importance. Therefore, the purpose of this study was to investigate what…
Descriptors: Science Instruction, Measurement Techniques, Psychometrics, Evidence
Brown, Richard S.; Coughlin, Ed – Regional Educational Laboratory Mid-Atlantic, 2007
This report examines the availability and quality of predictive validity data for a selection of benchmark assessments identified by state and district personnel as in use within Mid-Atlantic Region jurisdictions. Based on a review of practices within the school districts in the region, this report details the benchmark assessments being used, in…
Descriptors: Test Content, Academic Achievement, Predictive Validity, Program Effectiveness
Liu, Jinghua; Cahn, Miriam F.; Dorans, Neil J. – Journal of Educational Measurement, 2006
The College Board's SAT® data are used to illustrate how the score equity assessment (SEA) can help inform the program about equatability. SEA is used to examine whether the content change(s) to the revised new SAT result in differential linking functions across gender groups. Results of population sensitivity analyses are reported on the…
Descriptors: Aptitude Tests, Comparative Analysis, Gender Differences, Scores

Swanson, David B.; And Others – Academic Medicine, 1990
This study is the National Board of Medical Examiners' exploration of content-based techniques (standard-setting techniques in which pass/fail decisions are based upon the performance of examinees in relation to test content). Two content-based techniques (Angoff and Ebel) and three methods of evaluating examinee performance were studied. (MLW)
Descriptors: Content Validity, Evaluation Methods, Higher Education, Medical Education

Bauer, Scott C. – Education Policy Analysis Archives, 2000
Studied whether student scores on standardized tests represent reasonable measures of instructional quality using ratings by 10 parents and 11 educators (school principals) of the degree to which test items from a nationally marketed standardized achievement test represent the content actually taught. On average, raters felt that test items…
Descriptors: Achievement Tests, Educational Quality, Elementary Secondary Education, Evaluation Methods
Breland, Hunter M.; And Others – 1995
Brief, impromptu essays written for the 1990 administration of the College Board's English Composition Test (ECT) were randomly sampled for four groups of examinees. These essays were subjected to further holistic ratings beyond those conducted for the ECT, and analytical ratings were also obtained. The holistic scores were correlated with the…
Descriptors: Cohesion (Written Composition), English, Essays, Evaluation Methods
Buser, Karen – 1996
Most seasoned test developers recognize the importance of thoughtful decision making when constructing a test. Unfortunately, many classroom achievement tests are created by novice test developers who have not received sufficient instruction in item writing (G. Gulliksen, 1986; R. J. Stiggins, 1991). The result is often a test that is poorly…
Descriptors: Achievement Tests, Decision Making, Educational Planning, Evaluation Methods
Ory, John C.; Ryan, Katherine E. – 1993
This book for college faculty provides a resource for developing, using, and grading classroom exams. The first chapter addresses ways to determine what content should be included on an exam. The second chapter identifies testing considerations such as number of exams, difficulty level of items, and test length. Chapters 3 and 4 provide guidelines…
Descriptors: Classroom Techniques, Codes of Ethics, Essay Tests, Evaluation Methods
Palmer, Adrian – 1991
A discussion of second language program evaluation focuses on the interpretability of test scores as a criterion in program evaluation. It looks at both test design and research design issues. First, eight method-comparison, program evaluation studies that compare acquisition-based and analysis/practice based methods are described. Acquisition…
Descriptors: Comparative Analysis, Criterion Referenced Tests, Evaluation Criteria, Evaluation Methods
Jones, Brett D.; Egley, Robert J. – ERS Spectrum, 2005
The purpose of this paper is to discuss Florida teachers' recommendations for improving the Florida Comprehensive Assessment Test (FCAT) and to compare their recommendations with those of Florida administrators. Although teachers' suggestions varied as to the types and extent of remedies needed to improve the FCAT, some common themes emerged. The…
Descriptors: Test Results, Core Curriculum, Student Evaluation, Accountability