ERIC - Search Results

Publication Date

In 2025	0
Since 2024	1
Since 2021 (last 5 years)	4
Since 2016 (last 10 years)	11
Since 2006 (last 20 years)	18

Descriptor

Item Response Theory	33
English (Second Language)	28
Language Tests	25
Second Language Learning	19
Test Items	14
Scores	11
Foreign Countries	9
Comparative Analysis	7
Models	7
Test Construction	7
Computer Assisted Testing	6
Difficulty Level	6
Reading Comprehension	6
Statistical Analysis	6
Correlation	5
Estimation (Mathematics)	5
Language Proficiency	5
Psychometrics	5
Reading Tests	5
Simulation	5
Test Validity	5
College Students	4
Equated Scores	4
Goodness of Fit	4
Item Analysis	4

Source

Language Testing	6
ETS Research Report Series	5
Language Assessment Quarterly	3
Educational and Psychological…	2
Educational Assessment	1
Educational Psychology	1
Educational Technology &…	1
InSight: A Journal of…	1
Language Testing in Asia	1
Psicologica: International…	1
SAGE Open	1

Publication Type

Journal Articles	23
Reports - Research	23
Reports - Evaluative	10
Tests/Questionnaires	5

Education Level

Higher Education	5
Postsecondary Education	4
Elementary Education	1
Junior High Schools	1
Middle Schools	1
Secondary Education	1

Location

Iran	3
Australia	1
France	1
Greece	1
Hong Kong	1
Iran (Tehran)	1
Japan (Tokyo)	1
Kenya	1
Netherlands	1
South Korea	1
United Kingdom	1
United States	1
Vietnam	1

Assessments and Surveys

Test of English as a Foreign…	33
International English…	3
Test of English for…	1

Showing 1 to 15 of 33 results

Revisiting Raters' Accent Familiarity in Speaking Tests: Evidence That Presentation Mode Interacts with Accent Familiarity to Variably Affect Comprehensibility Ratings

Peer reviewed

Michael D. Carey; Stefan Szocs – Language Testing, 2024

This controlled experimental study investigated the interaction of variables associated with rating the pronunciation component of high-stakes English-language-speaking tests such as IELTS and TOEFL iBT. One hundred experienced raters who were all either familiar or unfamiliar with Brazilian-accented English or Papua New Guinean Tok Pisin-accented…

Descriptors: Dialects, Pronunciation, Suprasegmentals, Familiarity

Application of Nonparametric Item Response Theory in Determining the One-Dimensionality and Scalability of TOEFL iBT Listening Test

Peer reviewed

Ghaemi, Hamed – Language Testing in Asia, 2022

Listening comprehension in English, as one of the most fundamental skills, has an essential role in the process of learning English. Mokken scale analysis (MSA) is a probabilistic-nonparametric approach to item response theory (IRT) which determines the one-dimensionality and scalability of test. Mokken scaling techniques are a useful tool for…

Descriptors: Second Language Learning, English (Second Language), Nonparametric Statistics, Item Response Theory

The Impact of Using Synthetically Generated Listening Stimuli on Test-Taker Performance: A Case Study with Multiple-Choice, Single-Selection Items. TOEFL® Research Reports. RR-98. ETS RR-22-05

Peer reviewed
PDF on ERIC

Choi, Ikkyu; Zu, Jiyun – ETS Research Report Series, 2022

Synthetically generated speech (SGS) has become an integral part of our oral communication in a wide variety of contexts. It can be generated instantly at a low cost and allows precise control over multiple aspects of output, all of which can be highly appealing to second language (L2) assessment developers who have traditionally relied upon human…

Descriptors: Test Wiseness, Multiple Choice Tests, Test Items, Difficulty Level

For a Greater Good: Bias Analysis in Writing Assessment

Peer reviewed

Ahmadi Shirazi, Masoumeh – SAGE Open, 2019

Threats to construct validity should be reduced to a minimum. If true, sources of bias, namely raters, items, tests as well as gender, age, race, language background, culture, and socio-economic status need to be spotted and removed. This study investigates raters' experience, language background, and the choice of essay prompt as potential…

Descriptors: Foreign Countries, Language Tests, Test Bias, Essay Tests

Does EFL Readers' Lexical and Grammatical Knowledge Predict Their Reading Ability? Insights from a Perceptron Artificial Neural Network Study

Peer reviewed

Aryadoust, Vahid; Baghaei, Purya – Educational Assessment, 2016

This study aims to examine the relationship between reading comprehension and lexical and grammatical knowledge among English as a foreign language students by using an Artificial Neural Network (ANN). There were 825 test takers administered both a second-language reading test and a set of psychometrically validated grammar and vocabulary tests.…

Descriptors: English (Second Language), Reading Comprehension, Lexicology, Grammar

Retrofitting Diagnostic Classification Models to Responses from IRT-Based Assessment Forms

Peer reviewed

Liu, Ren; Huggins-Manley, Anne Corinne; Bulut, Okan – Educational and Psychological Measurement, 2018

Developing a diagnostic tool within the diagnostic measurement framework is the optimal approach to obtain multidimensional and classification-based feedback on examinees. However, end users may seek to obtain diagnostic feedback from existing item responses to assessments that have been designed under either the classical test theory or item…

Descriptors: Models, Item Response Theory, Psychometrics, Test Construction

Evaluating Subscore Uses across Multiple Levels: A Case of Reading and Listening Subscores for Young EFL Learners

Peer reviewed

Choi, Ikkyu; Papageorgiou, Spiros – Language Testing, 2020

Stakeholders of language tests are often interested in subscores. However, reporting a subscore is not always justified; a subscore should provide reliable and distinct information to be worth reporting. When a subscore is used for decisions across multiple levels (e.g., individual test takers and schools), it needs to be justified for its…

Descriptors: English (Second Language), Language Tests, Second Language Learning, Scores

Making Better Tests with the Rasch Measurement Model

Peer reviewed
PDF on ERIC

Karlin, Omar; Karlin, Sayaka – InSight: A Journal of Scholarly Teaching, 2018

This study had two aims. The first was to explain the process of using the Rasch measurement model to validate tests in an easy-to-understand way for those unfamiliar with the Rasch measurement model. The second was to validate two final exams with several shared items. The exams were given to two groups of students with slightly differing English…

Descriptors: Item Response Theory, Test Validity, Test Items, Accuracy

Towards Improved Assessment of L2 Collocation Knowledge

Peer reviewed

Lee, Senyung; Shin, Sun-Young – Language Assessment Quarterly, 2021

Multiple test tasks are available for assessing L2 collocation knowledge. However, few studies have investigated the characteristics of a variety of recognition and recall tasks of collocation simultaneously, and most research on L2 collocations has focused on verb-noun and adjective-noun collocations. This study investigates (1) the relative…

Descriptors: Phrase Structure, Second Language Learning, Language Tests, Recall (Psychology)

Using Multilevel Modeling in Language Assessment Research: A Conceptual Introduction

Peer reviewed

Barkaoui, Khaled – Language Assessment Quarterly, 2013

This article critiques traditional single-level statistical approaches (e.g., multiple regression analysis) to examining relationships between language test scores and variables in the assessment setting. It highlights the conceptual, methodological, and statistical problems associated with these techniques in dealing with multilevel or nested…

Descriptors: Hierarchical Linear Modeling, Statistical Analysis, Multiple Regression Analysis, Generalizability Theory

Modeling Local Item Dependence in Cloze and Reading Comprehension Test Items Using Testlet Response Theory

Peer reviewed
PDF on ERIC

Baghaei, Purya; Ravand, Hamdollah – Psicologica: International Journal of Methodology and Experimental Psychology, 2016

In this study the magnitudes of local dependence generated by cloze test items and reading comprehension items were compared and their impact on parameter estimates and test precision was investigated. An advanced English as a foreign language reading comprehension test containing three reading passages and a cloze test was analyzed with a…

Descriptors: Cloze Procedure, Reading, Reading Comprehension, Reading Skills

Diagnosing University Students' Academic Writing in English: Is Cognitive Diagnostic Modelling the Way Forward?

Peer reviewed

Xie, Qin – Educational Psychology, 2017

The study utilised a fine-grained diagnostic checklist to assess first-year undergraduates in Hong Kong and evaluated its validity and usefulness for diagnosing academic writing in English. Ten English language instructors marked 472 academic essays with the checklist. They also agreed on a Q-matrix, which specified the relationships among the…

Descriptors: Academic Discourse, College Students, College English, Foreign Countries

The Rasch Wars: The Emergence of Rasch Measurement in Language Testing

Peer reviewed

McNamara, Tim; Knoch, Ute – Language Testing, 2012

This paper examines the uptake of Rasch measurement in language testing through a consideration of research published in language testing research journals in the period 1984 to 2009. Following the publication of the first papers on this topic, exploring the potential of the simple Rasch model for the analysis of dichotomous language test data, a…

Descriptors: Language Tests, Testing, English (Second Language), Item Response Theory

The Relationship between Raters' Prior Language Study and the Evaluation of Foreign Language Speech Samples. TOEFL iBT® Research Report. TOEFL iBT-16. ETS Research Report RR-11-30

Peer reviewed
PDF on ERIC

Winke, Paula; Gass, Susan; Myford, Carol – ETS Research Report Series, 2011

This study investigated whether raters' second language (L2) background and the first language (L1) of test takers taking the TOEFL iBT® Speaking test were related through scoring. After an initial 4-hour training period, a group of 107 raters (mostly of learners of Chinese, Korean, and Spanish), listened to a selection of 432 speech samples that…

Descriptors: Second Language Learning, Evaluators, Speech Tests, English (Second Language)

Assessing the Test Information Function and Differential Item Functioning for the "TOEFL Junior"® Standard Test. Research Report. ETS RR-13-17. "TOEFL Junior"® Research Report. TOEFL JR-01

Peer reviewed
PDF on ERIC

Young, John W.; Morgan, Rick; Rybinski, Paul; Steinberg, Jonathan; Wang, Yuan – ETS Research Report Series, 2013

The "TOEFL Junior"® Standard Test is an assessment that measures the degree to which middle school-aged students learning English as a second language have attained proficiency in the academic and social English skills representative of English-medium instructional environments. The assessment measures skills in three areas: listening…

Descriptors: Item Response Theory, Test Items, Language Tests, Second Language Learning

Author

Way, Walter D.	3
Baghaei, Purya	2
Choi, Ikkyu	2
Hicks, Marilyn M.	2
Ahmadi Shirazi, Masoumeh	1
Aryadoust, Vahid	1
Bachman, Lyle F.	1
Barkaoui, Khaled	1
Boldt, R. F.	1
Boldt, Robert F.	1
Breland, Hunter	1
Bulut, Okan	1
Carr, Nathan T.	1
Chang, Hsin-Yi	1
Choi, Inn-Chull	1
Chyn, Susan	1
Gass, Susan	1
Gentile, Claudia	1
Ghaemi, Hamed	1
Hale, Gordon A.	1
Huggins-Manley, Anne Corinne	1
Karlin, Omar	1
Karlin, Sayaka	1
Kim, Hae-Jin	1