ERIC Number: EJ893640
Record Type: Journal
Publication Date: 2010-Jun
Pages: 25
Abstractor: As Provided
ISSN: 0305-0009
Morphosyntactic Annotation of CHILDES Transcripts
Sagae, Kenji; Davis, Eric; Lavie, Alon; MacWhinney, Brian; Wintner, Shuly
Journal of Child Language, v37 n3 spec iss p705-729 Jun 2010
Corpora of child language are essential for research in child language acquisition and psycholinguistics. Linguistic annotation of the corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. We describe a project whose goal is to annotate the English section of the CHILDES database with grammatical relations in the form of labeled dependency structures. We have produced a corpus of over 18,800 utterances (approximately 65,000 words) with manually curated gold-standard grammatical relation annotations. Using this corpus, we have developed a highly accurate data-driven parser for the English CHILDES data, which we used to automatically annotate the remainder of the English section of CHILDES. We have also extended the parser to Spanish, and are currently working on supporting more languages. The parser and the manually and automatically annotated data are freely available for research purposes.
Descriptors: Psycholinguistics, Grammar, Child Language, Language Acquisition, Databases, Computational Linguistics, Language Usage, Spanish, English, Second Languages, Language Research
Cambridge University Press. The Edinburgh Building, Shaftesbury Road, Cambridge, CB2 8RU, UK. Tel: 800-872-7423; Tel: 845-353-7500; Tel: +44-1223-326070; Fax: 845-353-4141; Fax: +44-1223-325150; e-mail: subscriptions_newyork@cambridge.org; Web site: http://www.cambridge.org
Publication Type: Journal Articles; Reports - Descriptive
Language: English