Publication Date
In 2025 | 0 |
Since 2024 | 0 |
Since 2021 (last 5 years) | 1 |
Since 2016 (last 10 years) | 1 |
Since 2006 (last 20 years) | 5 |
Descriptor
Models | 7 |
Foreign Countries | 5 |
Interrater Reliability | 4 |
Language Tests | 4 |
Second Language Learning | 3 |
Comparative Analysis | 2 |
Computer Software | 2 |
Correlation | 2 |
Expertise | 2 |
Scoring | 2 |
Test Reliability | 2 |
Source
Language Testing | 7 |
Author
Bosker, Hans Rutger | 1 |
Gierl, Mark J. | 1 |
Granfeldt, Jonas | 1 |
Haug, Tobias | 1 |
Hsieh, Mingchuan | 1 |
Pinget, Anne-France | 1 |
Quené, Hugo | 1 |
Raatz, Ulrich | 1 |
Schoonen, Rob | 1 |
Shin, Jinnie | 1 |
de Jong, Nivja H. | 1 |
Publication Type
Journal Articles | 7 |
Reports - Research | 6 |
Reports - Evaluative | 1 |
Education Level
Adult Education | 1 |
Elementary Education | 1 |
Grade 6 | 1 |
Higher Education | 1 |
Intermediate Grades | 1 |
Postsecondary Education | 1 |
Secondary Education | 1 |
Location
Netherlands | 2 |
Germany | 1 |
Sweden | 1 |
Taiwan | 1 |
United Kingdom | 1 |
Shin, Jinnie; Gierl, Mark J. – Language Testing, 2021
Automated essay scoring (AES) has emerged as a secondary or as a sole marker for many high-stakes educational assessments, in native and non-native testing, owing to remarkable advances in feature engineering using natural language processing, machine learning, and deep-neural algorithms. The purpose of this study is to compare the effectiveness…
Descriptors: Scoring, Essays, Writing Evaluation, Computer Software
Hsieh, Mingchuan – Language Testing, 2013
When implementing standard setting procedures, there are two major concerns: variance between panelists and efficiency in conducting multiple rounds of judgments. With regard to the former, there is concern over the consistency of the cutoff scores made by different panelists. If the cut scores show an inordinately wide range then further rounds…
Descriptors: Item Response Theory, Standard Setting (Scoring), Language Tests, English (Second Language)
Granfeldt, Jonas; Ågren, Malin – Language Testing, 2014
One core area of research in Second Language Acquisition is the identification and definition of developmental stages in different L2s. For L2 French, Bartning and Schlyter (2004) presented a model of six morphosyntactic stages of development in the shape of grammatical profiles. The model formed the basis for the computer program Direkt Profil…
Descriptors: Second Language Learning, Language Tests, French, Language Teachers
Haug, Tobias – Language Testing, 2012
Despite the current need for reliable and valid test instruments in different countries in order to monitor the sign language acquisition of deaf children, very few tests are commercially available that offer strong evidence for their psychometric properties. This mirrors the current state of affairs for many sign languages, where very little…
Descriptors: Evidence, Sign Language, Language Tests, Construct Validity
Pinget, Anne-France; Bosker, Hans Rutger; Quené, Hugo; de Jong, Nivja H. – Language Testing, 2014
Oral fluency and foreign accent distinguish L2 from L1 speech production. In language testing practices, both fluency and accent are usually assessed by raters. This study investigates what exactly native raters of fluency and accent take into account when judging L2. Our aim is to explore the relationship between objectively measured temporal,…
Descriptors: Native Speakers, Language Fluency, Suprasegmentals, Second Language Learning

Raatz, Ulrich – Language Testing, 1985
Argues that classical test theory cannot be used at the item level on "authentic" language tests. However, if the total score is derived by adding the scores of a number of different and independent parts, test reliability can be estimated. Suggests using the Classical Latent Additive model to examine test-part homogeneity. (Author/SED)
Descriptors: Item Analysis, Latent Trait Theory, Models, Second Language Learning

Schoonen, Rob; And Others – Language Testing, 1997
Reports on three studies conducted in the Netherlands about the reading reliability of lay and expert readers in rating content and language usage of students' writing performances in three kinds of writing assignments. Findings reveal that expert readers are more reliable in rating usage, whereas both lay and expert readers are reliable raters of…
Descriptors: Foreign Countries, Interrater Reliability, Language Usage, Models