Lestari, Santi B.; Brunfaut, Tineke – Language Testing, 2023
Assessing integrated reading-into-writing task performances is known to be challenging, and analytic rating scales have been found to better facilitate the scoring of these performances than other common types of rating scales. However, little is known about how specific operationalizations of the reading-into-writing construct in analytic rating…
Descriptors: Reading Writing Relationship, Writing Tests, Rating Scales, Writing Processes
Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021
Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Latifi, Syed; Gierl, Mark – Language Testing, 2021
An automated essay scoring (AES) program is a software system that uses techniques from corpus and computational linguistics and machine learning to grade essays. In this study, we aimed to describe and evaluate particular language features of Coh-Metrix for a novel AES program that would score junior and senior high school students' essays from…
Descriptors: Writing Evaluation, Computer Assisted Testing, Scoring, Essays
Olson, Daniel J. – Language Testing, 2023
Measuring language dominance, broadly defined as the relative strength of each of a bilingual's two languages, remains a crucial methodological issue in bilingualism research. While various methods have been proposed, the Bilingual Language Profile (BLP) has been one of the most widely used tools for measuring language dominance. While previous…
Descriptors: Bilingualism, Language Dominance, Native Language, Second Language Learning
Lin, Chih-Kai – Language Testing, 2017
Sparse-rated data are common in operational performance-based language tests, as an inevitable result of assigning examinee responses to a fraction of available raters. The current study investigates the precision of two generalizability-theory methods (i.e., the rating method and the subdividing method) specifically designed to accommodate the…
Descriptors: Data Analysis, Language Tests, Generalizability Theory, Accuracy
Wang, Zhen; Zechner, Klaus; Sun, Yu – Language Testing, 2018
As automated scoring systems for spoken responses are increasingly used in language assessments, testing organizations need to analyze their performance, as compared to human raters, across several dimensions, for example, on individual items or based on subgroups of test takers. In addition, there is a need in testing organizations to establish…
Descriptors: Automation, Scoring, Speech Tests, Language Tests
Haug, Tobias; Batty, Aaron Olaf; Venetz, Martin; Notter, Christa; Girard-Groeber, Simone; Knoch, Ute; Audeoud, Mireille – Language Testing, 2020
In this study we seek evidence of validity according to the socio-cognitive framework (Weir, 2005) for a new sentence repetition test (SRT) for young Deaf L1 Swiss German Sign Language (DSGS) users. SRTs have been developed for various purposes for both spoken and sign languages to assess language development in children. In order to address the…
Descriptors: Foreign Countries, Language Tests, Sentences, Repetition
Kleijn, Suzanne; Pander Maat, Henk; Sanders, Ted – Language Testing, 2019
Although there are many methods available for assessing text comprehension, the cloze test is not widely acknowledged as one of them. Critiques on cloze testing center on its supposedly limited ability to measure comprehension beyond the sentence. However, these critiques do not hold for all types of cloze tests; the particular configuration of a…
Descriptors: Cloze Procedure, Language Tests, Semantics, Scoring
Chan, Stephanie W. Y.; Cheung, Wai Ming; Huang, Yanli; Lam, Wai-Ip; Lin, Chin-Hsi – Language Testing, 2020
Demand for second-language (L2) Chinese education for kindergarteners has grown rapidly, but little is known about these kindergarteners' L2 skills, with existing studies focusing on school-age populations and alphabetic languages. Accordingly, we developed a six-subtest Chinese character acquisition assessment to measure L2 kindergarteners'…
Descriptors: Chinese, Second Language Learning, Second Language Instruction, Written Language
Attali, Yigal; Lewis, Will; Steier, Michael – Language Testing, 2013
Automated essay scoring can produce reliable scores that are highly correlated with human scores, but is limited in its evaluation of content and other higher-order aspects of writing. The increased use of automated essay scoring in high-stakes testing underscores the need for human scoring that is focused on higher-order aspects of writing. This…
Descriptors: Scoring, Essay Tests, Reliability, High Stakes Tests
Trace, Jonathan; Janssen, Gerriet; Meier, Valerie – Language Testing, 2017
Previous research in second language writing has shown that when scoring performance assessments even trained raters can exhibit significant differences in severity. When raters disagree, using discussion to try to reach a consensus is one popular form of score resolution, particularly in contexts with limited resources, as it does not require…
Descriptors: Performance Based Assessment, Second Language Learning, Scoring, Evaluators
Deygers, Bart; Van Gorp, Koen – Language Testing, 2015
Considering scoring validity as encompassing both reliable rating scale use and valid descriptor interpretation, this study reports on the validation of a CEFR-based scale that was co-constructed and used by novice raters. The research questions this paper wishes to answer are (a) whether it is possible to construct a CEFR-based rating scale with…
Descriptors: Rating Scales, Scoring, Validity, Interrater Reliability
Katzenberger, Irit; Meilijson, Sara – Language Testing, 2014
The Katzenberger Hebrew Language Assessment for Preschool Children (henceforth: the KHLA) is the first comprehensive, standardized language assessment tool developed in Hebrew specifically for older preschoolers (4;0-5;11 years). The KHLA is a norm-referenced, Hebrew specific assessment, based on well-established psycholinguistic principles, as…
Descriptors: Semitic Languages, Preschool Children, Language Impairments, Language Tests
Carey, Michael D.; Mannell, Robert H.; Dunn, Peter K. – Language Testing, 2011
This study investigated factors that could affect inter-examiner reliability in the pronunciation assessment component of speaking tests. We hypothesized that the rating of pronunciation is susceptible to variation in assessment due to the amount of exposure examiners have to nonnative English accents. An inter-rater variability analysis was…
Descriptors: Oral Language, Pronunciation, Phonology, Interlanguage
Goodwin, Amanda P.; Huggins, A. Corinne; Carlo, Maria; Malabonga, Valerie; Kenyon, Dorry; Louguit, Mohammed; August, Diane – Language Testing, 2012
This study describes the development and validation of the Extract the Base test (ETB), which assesses derivational morphological awareness. Scores on this test were validated for 580 monolingual students and 373 Spanish-speaking English language learners (ELLs) in third through fifth grade. As part of the validation of the internal structure,…
Descriptors: Reading Comprehension, Speech Communication, Second Language Learning, Scoring