ERIC - Search Results

Showing 1 to 15 of 111 results

The Intersection of AI and Language Assessment: A Study on the Reliability of ChatGPT in Grading IELTS Writing Task 2

Peer reviewed

Osama Koraishi – Language Teaching Research Quarterly, 2024

This study conducts a comprehensive quantitative evaluation of OpenAI's language model, ChatGPT 4, for grading Task 2 writing of the IELTS exam. The objective is to assess the alignment between ChatGPT's grading and that of official human raters. The analysis encompassed a multifaceted approach, including a comparison of means and reliability…

Descriptors: Second Language Learning, English (Second Language), Language Tests, Artificial Intelligence

Crowdsourced Adaptive Comparative Judgment: A Community-Based Solution for Proficiency Rating

Peer reviewed

Paquot, Magali; Rubin, Rachel; Vandeweerd, Nathan – Language Learning, 2022

The main objective of this Methods Showcase Article is to show how the technique of adaptive comparative judgment, coupled with a crowdsourcing approach, can offer practical solutions to reliability issues as well as to address the time and cost difficulties associated with a text-based approach to proficiency assessment in L2 research. We…

Descriptors: Comparative Analysis, Decision Making, Language Proficiency, Reliability

Measuring Language Ability of Students with Compensatory Multidimensional CAT: A Post-Hoc Simulation Study

Peer reviewed

Ozdemir, Burhanettin; Gelbal, Selahattin – Education and Information Technologies, 2022

The computerized adaptive tests (CAT) apply an adaptive process in which the items are tailored to individuals' ability scores. The multidimensional CAT (MCAT) designs differ in terms of different item selection, ability estimation, and termination methods being used. This study aims at investigating the performance of the MCAT designs used to…

Descriptors: Scores, Computer Assisted Testing, Test Items, Language Proficiency

The Effects of Multimodal Teaching on English Vocabulary Knowledge of Thai Primary School Students

Peer reviewed

Kasikarn Bansong; Somkiet Poopatwiboon; Apisak Sukying – Journal of Education and Learning, 2023

It is increasingly prevalent in digital learning and teaching strategies for discerning a global perspective on creating the student learning experience. Multimodality is an emergent phenomenon that may influence how digital learning is designed, especially during the COVID-19 pandemic in which immersive learning environments, such as a virtual…

Descriptors: Elementary School Students, English (Second Language), Second Language Learning, Second Language Instruction

Applying Generalizability Theory in Language Testing: Comparing Nested and Crossed Scoring Designs in the Assessment of Speaking Skills

Peer reviewed

Polat, Murat; Turhan, Nihan Sölpük – International Journal of Curriculum and Instruction, 2021

Scoring language learners' speaking skills is open to a number of measurement errors since raters' personal judgements could involve in the process. Different grading designs in which raters score a student's whole speaking skills or a specific dimension of the speaking performance could be settled to control and minimize the amount of the error…

Descriptors: Language Tests, Scoring, Speech Communication, State Universities

Monitoring the Performance of Human and Automated Scores for Spoken Responses

Peer reviewed

Wang, Zhen; Zechner, Klaus; Sun, Yu – Language Testing, 2018

As automated scoring systems for spoken responses are increasingly used in language assessments, testing organizations need to analyze their performance, as compared to human raters, across several dimensions, for example, on individual items or based on subgroups of test takers. In addition, there is a need in testing organizations to establish…

Descriptors: Automation, Scoring, Speech Tests, Language Tests

Investigating the Impact of Rater Training on Rater Errors in the Process of Assessing Writing Skill

Peer reviewed

Sata, Mehmet; Karakaya, Ismail – International Journal of Assessment Tools in Education, 2022

In the process of measuring and assessing high-level cognitive skills, interference of rater errors in measurements brings about a constant concern and low objectivity. The main purpose of this study was to investigate the impact of rater training on rater errors in the process of assessing individual performance. The study was conducted with a…

Descriptors: Evaluators, Training, Comparative Analysis, Academic Language

Assessment by Comparative Judgement: An Application to Secondary Statistics and English in New Zealand

Peer reviewed

Marshall, Neil; Shaw, Kirsten; Hunter, Jodie; Jones, Ian – New Zealand Journal of Educational Studies, 2020

There is growing interest in using comparative judgement to assess student work as an alternative to traditional marking. Comparative judgement requires no rubrics and is instead grounded in experts making pairwise judgements about the relative 'quality' of students' work according to a high level criterion. The resulting decision data are fitted…

Descriptors: Comparative Analysis, Decision Making, Student Evaluation, Evaluation Methods

Calibrated Parsing Items Evaluation: A Step towards Objectifying the Translation Assessment

Peer reviewed

Akbari, Alireza; Shahnazari, Mohammadtaghi – Language Testing in Asia, 2019

The present research paper introduces a translation evaluation method called Calibrated Parsing Items Evaluation (CPIE hereafter). This evaluation method maximizes translators' performance through identifying the parsing items with an optimal p-docimology and d-index (item discrimination). This method checks all the possible parses (annotations)…

Descriptors: Test Items, Translation, Computer Software, Evaluators

Is Putting SUGAR (Sampling Utterances of Grammatical Analysis Revised) into Language Sample Analysis a Good Thing? A Response to Pavelko and Owens (2017)

Peer reviewed

Guo, Ling-Yu; Eisenberg, Sarita; Bernstein Ratner, Nan; MacWhinney, Brian – Language, Speech, and Hearing Services in Schools, 2018

Purpose: In this letter, the authors respond to Pavelko and Owens' (2017) newly advanced set of procedures for language sample analysis: Sampling Utterances and Grammatical Analysis Revised (SUGAR). Method: The authors contrast some of the new guidelines for transcription, morpheme segmentation, and language sample elicitation in SUGAR with…

Descriptors: Sampling, Grammar, Transcripts (Written Records), Morphemes

Measuring the Development of General Language Skills in English as a Foreign Language--Longitudinal Invariance of the C-Test

Peer reviewed

Schnoor, Birger; Hartig, Johannes; Klinger, Thorsten; Naumann, Alexander; Usanova, Irina – Language Testing, 2023

Research on assessing English as a foreign language (EFL) development has been growing recently. However, empirical evidence from longitudinal analyses based on substantial samples is still needed. In such settings, tests for measuring language development must meet high standards of test quality such as validity, reliability, and objectivity, as…

Descriptors: English (Second Language), Second Language Learning, Second Language Instruction, Longitudinal Studies

Assessing L2 English Speaking Using Automated Scoring Technology: Examining Automarker Reliability

Peer reviewed

Xu, Jing; Jones, Edmund; Laxton, Victoria; Galaczi, Evelina – Assessment in Education: Principles, Policy & Practice, 2021

Recent advances in machine learning have made automated scoring of learner speech widespread, and yet validation research that provides support for applying automated scoring technology to assessment is still in its infancy. Both the educational measurement and language assessment communities have called for greater transparency in describing…

Descriptors: Second Language Learning, Second Language Instruction, English (Second Language), Computer Software

Can Dynamic Assessment Identify Language Disorder in Multilingual Children? Clinical Applications from a Systematic Review

Peer reviewed

Hunt, Emily; Nang, Charn; Meldrum, Suzanne; Armstrong, Elizabeth – Language, Speech, and Hearing Services in Schools, 2022

Purpose: Multilingual children are disproportionately represented on speech pathology caseloads, in part due to the limited ability of traditional language assessments to accurately capture multilingual children's language abilities. This systematic review evaluates the evidence for identification of language disorder in multilingual children…

Descriptors: Multilingualism, Speech Language Pathology, Language Tests, Diagnostic Tests

Analysis of IELTS and TOEFL Reading and Listening Tests in Terms of Revised Bloom's Taxonomy

Peer reviewed

Baghaei, Samira; Bagheri, Mohammad Sadegh; Yamini, Mortaza – Cogent Education, 2020

The main purpose of this quantitative-qualitative content analysis study was to compare IELTS and TOEFL listening and reading tests based on the representation of the learning objectives of Revised Bloom's taxonomy. To this end, 12 Academic IELTS listening and reading tests and 12 TOEFL iBT listening and reading tests were analyzed qualitatively…

Descriptors: Second Language Learning, English (Second Language), Language Tests, Reading Tests

Development of the English Listening and Reading Computerized Revised Token Test into Cantonese: Validity, Reliability, and Sensitivity/Specificity in People with Aphasia and Healthy Controls

Peer reviewed

Bakhtiar, Mehdi; Wong, Min Ney; Tsui, Emily Ka Yin; McNeil, Malcolm R. – Journal of Speech, Language, and Hearing Research, 2020

Purpose: This study reports the psychometric development of the Cantonese versions of the English Computerized Revised Token Test (CRTT) for persons with aphasia (PWAs) and healthy controls (HCs). Method: The English CRTT was translated into standard Chinese for the Reading--Word Fade version (CRTT-R-WF-Cantonese) and into formal…

Descriptors: Psychometrics, Sino Tibetan Languages, Computer Assisted Testing, Aphasia
