Peer reviewed
ERIC Number: EJ1420889
Record Type: Journal
Publication Date: 2024-Apr
Pages: 21
Abstractor: As Provided
ISBN: N/A
ISSN: ISSN-0267-6583
EISSN: EISSN-1477-0326
The CELI Corpus: Design and Linguistic Annotation of a New Online Learner Corpus
Stefania Spina; Irene Fioravanti; Luciana Forti; Fabio Zanda
Second Language Research, v40 n2 p457-477 2024
This article introduces the CELI corpus, a new learner corpus of written Italian consisting of ca. 600,000 tokens, evenly distributed among CEFR (Common European Framework of Reference for Languages) proficiency levels B1, B2, C1 and C2. The collected texts derive from the language certification exams administered by the University for Foreigners of Perugia all around the world. The corpus contains rich metadata pertaining to text-related and learner-related variables. It expands the domain of learner corpora by, among other things, being freely available online to the research community and focusing on a target language other than English. The article also presents and evaluates the POS-tagging procedure, thus contributing to best practices in learner corpus annotation.
SAGE Publications. 2455 Teller Road, Thousand Oaks, CA 91320. Tel: 800-818-7243; Tel: 805-499-9774; Fax: 800-583-2665; e-mail: journals@sagepub.com; Web site: https://bibliotheek.ehb.be:2993
Publication Type: Journal Articles; Information Analyses; Reports - Descriptive
Education Level: Higher Education; Postsecondary Education
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Identifiers - Location: Italy
Grant or Contract Numbers: N/A