Publication Date
In 2025 | 0 |
Since 2024 | 1 |
Since 2021 (last 5 years) | 3 |
Since 2016 (last 10 years) | 11 |
Since 2006 (last 20 years) | 98 |
Descriptor
Rating Scales | 188 |
Test Reliability | 96 |
Test Validity | 79 |
Interrater Reliability | 60 |
Evaluation Methods | 47 |
Reliability | 43 |
Psychometrics | 42 |
Correlation | 35 |
Test Construction | 35 |
Validity | 32 |
Foreign Countries | 28 |
Author
Epstein, Michael H. | 3 |
Knoch, Ute | 3 |
Elder, Catherine | 2 |
Houston, Walter M. | 2 |
Lee, Donghyuck | 2 |
Matson, Johnny L. | 2 |
Raymond, Mark R. | 2 |
Ritvo, Riva Ariella | 2 |
Shaughnessy, Michael F. | 2 |
Adcock, Amy B. | 1 |
Adler, Lenard A. | 1 |
Education Level
Higher Education | 17 |
Elementary Secondary Education | 12 |
Postsecondary Education | 9 |
Elementary Education | 7 |
Early Childhood Education | 3 |
Grade 1 | 3 |
Grade 2 | 3 |
Grade 3 | 3 |
Grade 4 | 3 |
Grade 5 | 3 |
High Schools | 3 |
Audience
Practitioners | 6 |
Researchers | 4 |
Teachers | 2 |
Administrators | 1 |
Location
China | 3 |
Hong Kong | 3 |
United States | 3 |
Australia | 2 |
Canada | 2 |
Florida | 2 |
Louisiana | 2 |
South Africa | 2 |
Africa | 1 |
Arizona | 1 |
Brazil | 1 |
Laws, Policies, & Programs
Improving Americas Schools… | 1 |
Individuals with Disabilities… | 1 |
No Child Left Behind Act 2001 | 1 |
Mansi Wadhwa; Jingwen Zheng; Thomas D. Cook – Review of Educational Research, 2024
Clearinghouses set standards of scientific quality to vet existing research to determine how "evidence-based" an intervention is. This paper examines 12 educational clearinghouses to describe their effectiveness criteria, to estimate how consistently they rate the same program, and to probe why their judgments differ. All the…
Descriptors: Clearinghouses, Standards, Evaluation Criteria, Reliability
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In the existing research work on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
Knoch, Ute; Deygers, Bart; Khamboonruang, Apichat – Language Testing, 2021
Rating scale development in the field of language assessment is often considered in dichotomous ways: It is assumed to be guided either by expert intuition or by drawing on performance data. Even though quite a few authors have argued that rating scale development is rarely so easily classifiable, this dyadic view has dominated language testing…
Descriptors: Rating Scales, Test Construction, Language Tests, Test Use
Knoch, Ute; Chapelle, Carol A. – Language Testing, 2018
Argument-based validation requires test developers and researchers to specify what is entailed in test interpretation and use. Doing so has been shown to yield advantages (Chapelle, Enright, & Jamieson, 2010), but it also requires an analysis of how the concerns of language testers can be conceptualized in the terms used to construct a…
Descriptors: Test Validity, Language Tests, Evaluation Research, Rating Scales
Montes, Guillermo; Reynolds Weber, Melissa; Infurna, Charles; Van Wagner, Genemarie; Zimmer, Ariana; Hightower, A. Dirk – European Early Childhood Education Research Journal, 2018
The aim of this article was to independently test the factor structure of the Early Childhood Environment Rating Scale, 3rd Edition (ECERS-3). Using a sample of 148 independent observations, Standard and Satorra-Bentler confirmatory factor analysis were used to determine if the ECERS-3 conformed to the structure published in its manual using a…
Descriptors: Factor Structure, Early Childhood Education, Educational Environment, Rating Scales
Reutzel, D. Ray; Mohr, Kathleen A. J. – Literacy Research and Instruction, 2014
In this response to "Measuring Students' Writing Ability on a Computer Analytic Developmental Scale: An Exploratory Validity Study," the authors agree that assessments should seek parsimony in both theory and application wherever possible. Doing so allows maximal dissemination and implementation while minimizing costs. The Writing…
Descriptors: Writing Ability, Discovery Processes, Rating Scales, Construct Validity
Karren, Benjamin C. – Journal of Psychoeducational Assessment, 2017
The Gilliam Autism Rating Scale-Third Edition (GARS-3) is a norm-referenced tool designed to screen for autism spectrum disorders (ASD) in individuals between the ages of 3 and 22 (Gilliam, 2014). The GARS-3 test kit consists of three different components and includes an "Examiner's Manual," summary/response forms (50), and the…
Descriptors: Autism, Pervasive Developmental Disorders, Rating Scales, Norm Referenced Tests
Isbell, Dan; Winke, Paula – Language Testing, 2019
The American Council on the Teaching of Foreign Languages (ACTFL) oral proficiency interview – computer (OPIc) testing system represents an ambitious effort in language assessment: Assessing oral proficiency in over a dozen languages, on the same scale, from virtually anywhere at any time. Especially for users in contexts where multiple foreign…
Descriptors: Oral Language, Language Tests, Language Proficiency, Second Language Learning
Min, Shangchao; He, Lianzhen; Zhang, Jie – Language Teaching, 2020
This article reviews a selected sample of 70 empirical studies in journal articles and doctoral dissertations on language assessment in China between 2011 and 2018. Following a brief introduction to the history and current state of language assessment in China, the article presents a critical review of language assessment research on six themes…
Descriptors: Language Tests, Test Reliability, Test Validity, Journal Articles
Fenwick, Melanie; McCrimmon, Adam W. – Canadian Journal of School Psychology, 2015
This article provides a description and review of the "Comprehensive Executive Function Inventory" (CEFI; Naglieri & Goldstein, 2013), published by Multi-Health Systems Inc. (MHS). It is a rating scale developed to measure a wide array of Executive Function (EF) abilities in individuals aged 5 through 18 years. Completed by a parent,…
Descriptors: Executive Function, Rating Scales, Cognitive Tests, Children
Perdue, Elizabeth A. – Journal of Psychoeducational Assessment, 2016
The "Attention-Deficit/Hyperactivity Disorder Test-Second Edition" (ADHDT-2) is published through Pro-Ed in Austin, Texas. It was formally published in 2014, following critical revisions of the ADHDT, the reportedly popular initial version of this test that was published in 1995. The ADHDT-2 purports to act as a screener for individuals…
Descriptors: Attention Deficit Hyperactivity Disorder, Tests, Screening Tests, Symptoms (Individual Disorders)
Gresham, Frank M. – Cambridge Journal of Education, 2016
Children and youth with deficits in social competence present substantial challenges for schools, teachers, parents and peers. These challenges cut across disciplinary, instructional and interpersonal domains and they frequently create chaotic home, school and classroom environments. Schools are charged with teaching an increasingly diverse…
Descriptors: Interpersonal Competence, Student Evaluation, Children, Youth
McGill, Ryan J. – Journal of Psychoeducational Assessment, 2013
The Children's Psychological Processing Scale (CPPS), authored by Milton J. Dehn and published by Schoolhouse Educational Services in 2012, is a third-party rating scale that can be administered to teachers who are familiar with children ages 5 to 12. The measure is designed to identify psychological processing deficits in children who are…
Descriptors: Rating Scales, Teachers, Children, Learning Disabilities
Cheng, Liying; Fox, Janna – Language Teaching, 2013
This paper reviews a selected sample of 24 doctoral dissertations in language assessment (broadly defined), completed between 2006 and 2011 in Canadian universities. These dissertations fall into five thematic categories: 1) reliability, validity and factors affecting test performance; 2) washback (impact) and ethics; 3) raters, rating and rating…
Descriptors: Foreign Countries, Doctoral Dissertations, Mixed Methods Research, Language Research
Gustafsson, Jan-Eric; Erickson, Gudrun – Educational Assessment, Evaluation and Accountability, 2013
In the Swedish educational system, teachers have the dual responsibility of assigning final grades and marking their own students' national tests. The Government has mandated the Swedish Schools Inspectorate to re-mark samples of the national tests to see if teacher marking can be trusted. Reports from this project have concluded that intermarker…
Descriptors: Logical Thinking, Student Evaluation, Inferences, Trust (Psychology)