Showing 1 to 15 of 16 results
Peer reviewed
Doewes, Afrizal; Kurdhi, Nughthoh Arfawi; Saxena, Akrati – International Educational Data Mining Society, 2023
Automated Essay Scoring (AES) tools aim to improve the efficiency and consistency of essay scoring by using machine learning algorithms. In existing research on this topic, most researchers agree that human-automated score agreement remains the benchmark for assessing the accuracy of machine-generated scores. To measure the performance of…
Descriptors: Essays, Writing Evaluation, Evaluators, Accuracy
Peer reviewed
Yu-Tzu Chang; Ann Tai Choe; Daniel Holden; Daniel R. Isbell – Language Testing, 2024
In this Brief Report, we describe an evaluation of and revisions to a rubric adapted from Jacobs et al.'s (1981) ESL COMPOSITION PROFILE, with four rubric categories and 20-point rating scales, in the context of an intensive English program writing placement test. Analysis of 4 years of rating data (2016-2021, including 434 essays) using…
Descriptors: Language Tests, Rating Scales, Second Language Learning, English (Second Language)
Walland, Emma – Research Matters, 2022
In this article, I report on examiners' views and experiences of using Pairwise Comparative Judgement (PCJ) and Rank Ordering (RO) as alternatives to traditional analytical marking for GCSE English Language essays. Fifteen GCSE English Language examiners took part in the study. After each had judged 100 pairs of essays using PCJ and eight packs of…
Descriptors: Essays, Grading, Writing Evaluation, Evaluators
Peer reviewed
Husain Abdulhay; Moussa Ahmadian – rEFLections, 2024
This study attempted to discern the factor structure of the achievement goal orientation and goal structure constructs across the domain-specific task of essay writing in an Iranian EFL context. A convenience sample of 116 public university learners participated in a single-session, in-class study of an essay writing sampling and an immediate…
Descriptors: Foreign Countries, Factor Structure, Goal Orientation, Factor Analysis
Peer reviewed
Zhang, Xiuyuan – AERA Online Paper Repository, 2019
The main purpose of the study is to evaluate the qualities of human essay ratings for a large-scale assessment using Rasch measurement theory. Specifically, Many-Facet Rasch Measurement (MFRM) was utilized to examine the rating scale category structure and provide important information about interpretations of ratings in the large-scale…
Descriptors: Essays, Evaluators, Writing Evaluation, Reliability
Peer reviewed
Ghanbari, Nasim; Barati, Hossein – Language Testing in Asia, 2020
The present study reports the process of development and validation of a rating scale in the Iranian EFL academic writing assessment context. To achieve this goal, the study was conducted in three distinct phases. Early in the study, the researcher interviewed a number of raters in different universities. Next, a questionnaire was developed based…
Descriptors: Rating Scales, Writing Evaluation, English for Academic Purposes, Second Language Learning
Peer reviewed
Jeong, Heejeong – Language Testing in Asia, 2019
In writing assessment, finding a valid, reliable, and efficient scale is critical. Appropriate scales increase rater reliability and can also save time and money. This exploratory study compared the effects of a binary scale and an analytic scale across teacher raters and expert raters. The purpose of the study is to find out how different scale…
Descriptors: Writing Evaluation, English (Second Language), Second Language Learning, Second Language Instruction
Peer reviewed
Kim, Susie – Language Assessment Quarterly, 2021
Recent research conducted as part of the English Profile identified grammatical criterial features that are characteristic of each Common European Framework of Reference (CEFR) proficiency level. The extent to which these criterial features are attested in English learners of various first language backgrounds calls for an empirical examination. In this…
Descriptors: Guidelines, Rating Scales, Second Language Learning, Second Language Instruction
Yun, Jiyeo – ProQuest LLC, 2017
Since researchers began investigating automatic scoring systems in writing assessment, they have examined the relationship between human and machine scores and proposed evaluation criteria for inter-rater agreement. The main purpose of my study is to investigate the magnitudes of and relationships among indices for inter-rater agreement used…
Descriptors: Interrater Reliability, Essays, Scoring, Evaluators
Peer reviewed
Schaefer, Edward – Language Testing, 2008
The present study employed multi-faceted Rasch measurement (MFRM) to explore the rater bias patterns of native English-speaker (NES) raters when they rate EFL essays. Forty NES raters rated 40 essays written by female Japanese university students on a single topic adapted from the TOEFL Test of Written English (TWE). The essays were assessed using…
Descriptors: Writing Evaluation, Writing Tests, Program Effectiveness, Essays
Peer reviewed
Barkaoui, Khaled – Assessing Writing, 2007
Educators often have to choose among different types of rating scales to assess second-language (L2) writing performance. There is little research, however, on how different rating scales affect rater performance. This study employed a mixed-method approach to investigate the effects of two different rating scales on EFL essay scores, rating…
Descriptors: Writing Evaluation, Writing Tests, Rating Scales, Essays
Henning, Grant – 1992
The psychometric characteristics of the Test of Written English (TWE) rating scale were explored. Rasch model scalar analysis methodology was employed with more than 4,000 scored essays across 2 elicitation prompts to gather information about the rating scale and rating process. Results suggested that the intervals between TWE scale steps were…
Descriptors: English (Second Language), Equated Scores, Essays, Interrater Reliability
Peer reviewed
Polio, Charlene G. – Language Learning, 1997
Investigates the reliability of measures of linguistic accuracy in second language writing. The study uses a holistic scale, error-free T-units, and an error classification system on the essays of English-as-a-Second-Language students and discusses why disagreements arise within a rater and between raters. (24 references) (Author/CK)
Descriptors: College Students, English (Second Language), Error Analysis (Language), Error of Measurement
Longford, Nicholas T. – 1993
A model-based approach to rater reliability for essays read by multiple readers is presented. Variation of rater severity (between-rater variation) and rater inconsistency (within-rater variation) is considered in the presence of between-examinee variation. An additive variance component model is posited and the method of moments for its…
Descriptors: Educational Diagnosis, Error of Measurement, Essays, Estimation (Mathematics)
Biesbrock, Edieann – 1969
Written compositions of 200 second and third grade children were compared four times over a 2-year period to determine relationships of composition ability with intelligence and reading level. A global essay instrument developed at the University of Georgia was used for the comparisons to find out whether or not the instrument revealed any…
Descriptors: Age Differences, Elementary School Students, Essays, Grade 2