Davis, Larry; Papageorgiou, Spiros – Assessment in Education: Principles, Policy & Practice, 2021
Human raters and machine scoring systems potentially have complementary strengths in evaluating language ability; specifically, it has been suggested that automated systems might be used to make consistent measurements of specific linguistic phenomena, whilst humans evaluate more global aspects of performance. We report on an empirical study that…
Descriptors: Scoring, English for Academic Purposes, Oral English, Speech Tests
Tavarez Da Costa, Pedro; Reyes Arias, Fransheska – Online Submission, 2021
The present work seeks to establish a comparison between two different and distant evaluation tools applied to the Dominican student population in order to measure the efficiency of our educational system in recent years; one of them measured the quality of Dominican education in three areas (the PISA Test), whereas the other tested the…
Descriptors: Foreign Countries, Standardized Tests, Student Evaluation, International Assessment
Kermad, Alyssa; Bogorevich, Valeria – Language Teaching Research Quarterly, 2022
The practice of second language (L2) speech perception has traditionally relied on equal-interval perceptual scales and novice listeners' (NLs) impressionistic judgments of constructs such as accentedness and comprehensibility (Munro & Derwing, 2011). However, issues have surfaced with respect to how well NLs can use these scales, whether they…
Descriptors: Speech Communication, Second Language Learning, Intelligibility, Rating Scales
Baghaei, Samira; Bagheri, Mohammad Sadegh; Yamini, Mortaza – Cogent Education, 2020
The main purpose of this quantitative-qualitative content analysis study was to compare IELTS and TOEFL listening and reading tests based on the representation of the learning objectives of Revised Bloom's taxonomy. To this end, 12 Academic IELTS listening and reading tests and 12 TOEFL iBT listening and reading tests were analyzed qualitatively…
Descriptors: Second Language Learning, English (Second Language), Language Tests, Reading Tests
Iberri-Shea, Gina – Cogent Education, 2017
Prominent spoken language assessments such as the Oral Proficiency Interview and the Test of Spoken English have been primarily concerned with speaking ability as it relates to conversation. This paper looks at an additional aspect of spoken language ability, namely public speaking. This study used an adapted form of a public speaking rating scale…
Descriptors: Public Speaking, Rating Scales, Adoption (Ideas), English Instruction
Ahmadi, Alireza – Taiwan Journal of TESOL, 2020
Rater subjectivity has long been an intriguing topic. The use of discussion as a resolution method is a practical way to reduce this subjectivity. However, the efficacy of discussion depends on whether different raters get equally engaged in it or one rater tends to dominate others. This study investigated whether and how rater dominance occurs in…
Descriptors: Evaluators, Interrater Reliability, Discussion, Discourse Analysis
Davis, Larry; Norris, John – ETS Research Report Series, 2021
The elicited imitation task (EIT), in which language learners listen to a series of spoken sentences and repeat each one verbatim, is a commonly used measure of language proficiency in second language acquisition research. The TOEFL® Essentials™ test includes an EIT as a holistic measure of speaking proficiency, referred to as the…
Descriptors: Task Analysis, Language Proficiency, Speech Communication, Language Tests
Toroujeni, Seyyed Morteza Hashemi – Education and Information Technologies, 2022
Score interchangeability of Computerized Fixed-Length Linear Testing (henceforth CFLT) and Paper-and-Pencil-Based Testing (henceforth PPBT) has become a controversial issue over the last decade, as technology has meaningfully restructured the methods of educational assessment. Given this controversy, various testing guidelines published on…
Descriptors: Computer Assisted Testing, Reading Tests, Reading Comprehension, Scoring
Rupp, André A.; Casabianca, Jodi M.; Krüger, Maleika; Keller, Stefan; Köller, Olaf – ETS Research Report Series, 2019
In this research report, we describe the design and empirical findings for a large-scale study of essay writing ability with approximately 2,500 high school students in Germany and Switzerland on the basis of 2 tasks with 2 associated prompts, each from a standardized writing assessment whose scoring involved both human and automated components.…
Descriptors: Automation, Foreign Countries, English (Second Language), Language Tests
Ahmadi Shirazi, Masoumeh – SAGE Open, 2019
Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…
Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests
Adding Value to Second-Language Listening and Reading Subscores: Using a Score Augmentation Approach
Papageorgiou, Spiros; Choi, Ikkyu – International Journal of Testing, 2018
This study examined whether reporting subscores for groups of items within a test section assessing a second-language modality (specifically reading or listening comprehension) added value from a measurement perspective to the information already provided by the section scores. We analyzed the responses of 116,489 test takers to reading and…
Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Language Tests
Papageorgiou, Spiros; Xi, Xiaoming; Morgan, Rick; So, Youngsoon – Language Assessment Quarterly, 2015
This study presents the development and empirical validation of score levels and descriptors specifically designed for reporting purposes to provide test takers with more than just a number on a score scale. In the context of a test primarily intended for 11- to 15-year-old students learning English as a second/foreign language, the study examined…
Descriptors: Scores, Validity, Scaling, Classification
Loukina, Anastassia; Buzick, Heather – ETS Research Report Series, 2017
This study is an evaluation of the performance of automated speech scoring for speakers with documented or suspected speech impairments. Given that the use of automated scoring of open-ended spoken responses is relatively nascent and there is little research to date that includes test takers with disabilities, this small exploratory study focuses…
Descriptors: Automation, Scoring, Language Tests, Speech Tests
Choi, Ikkyu; Papageorgiou, Spiros – Language Testing, 2020
Stakeholders of language tests are often interested in subscores. However, reporting a subscore is not always justified; a subscore should provide reliable and distinct information to be worth reporting. When a subscore is used for decisions across multiple levels (e.g., individual test takers and schools), it needs to be justified for its…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Scores
Sawaki, Yasuyo; Sinharay, Sandip – Language Testing, 2018
The present study examined the reliability of the reading, listening, speaking, and writing section scores for the TOEFL iBT® test and their interrelationship in order to collect empirical evidence to support, respectively, the "generalization" inference and the "explanation" inference in the TOEFL iBT validity argument…
Descriptors: English (Second Language), Language Tests, Second Language Learning, Computer Assisted Testing