ERIC Number: EJ1453645
Record Type: Journal
Publication Date: 2024-Dec
Pages: 27
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-1560-4292
EISSN: EISSN-1560-4306
Crosslingual Content Scoring in Five Languages Using Machine-Translation and Multilingual Transformer Models
Andrea Horbach; Joey Pehlke; Ronja Laarmann-Quante; Yuning Ding
International Journal of Artificial Intelligence in Education, v34 n4 p1294-1320 2024
This paper investigates crosslingual content scoring, a scenario where scoring models trained on learner data in one language are applied to data in a different language. We analyze data in five different languages (Chinese, English, French, German and Spanish) collected for three prompts of the established English ASAP content scoring dataset. We cross the language barrier by means of both shallow and deep learning crosslingual classification models using both machine translation and multilingual transformer models. We find that a combination of machine translation and multilingual models outperforms each method individually. Our best results are reached when combining the available data in different languages, i.e., first training a model on the large English ASAP dataset and then fine-tuning it on smaller amounts of training data in the target language.
Descriptors: Contrastive Linguistics, Scoring, Learning Analytics, Chinese, English, French, German, Spanish, Second Languages, Computational Linguistics, Classification, Artificial Intelligence, Translation, Computer Software
Springer. Available from: Springer Nature. One New York Plaza, Suite 4600, New York, NY 10004. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-460-1700; e-mail: customerservice@springernature.com; Web site: https://bibliotheek.ehb.be:2123/
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A
Data File: URL: https://github.com/andreahorbach/CrosslingualScoring/