Developing Learner Corpus Annotation for Korean Particle Errors
Abstract
We aim to sufficiently define annotation for post-positional particle errors in L2 Korean writing, so that future work on automatic particle error detection can make progress. To achieve this goal, we outline the linguistic properties of Korean particles in learner data. Given the agglutinative nature of Korean and the range of functions of particles, this annotation effort involves issues such as defining the tokens and target forms.
Similar resources
Grammatical Error Annotation for Korean Learners of Spoken English
The goal of our research is to build a grammatical error-tagged corpus for Korean learners of Spoken English dubbed Postech Learner Corpus. We collected raw story-telling speech from Korean university students. Transcription and annotation using the Cambridge Learner Corpus tagset were performed by six Korean annotators fluent in English. For the annotation of the corpus, we developed an annota...
Building a Korean Web Corpus for Analyzing Learner Language
Post-positional particles are a significant source of errors for learners of Korean. Following methodology that has proven effective in handling English preposition errors, we are beginning the process of building a machine learner for particle error detection in L2 Korean writing. As a first step, however, we must acquire data, and thus we present a methodology for constructing large-scale cor...
Developing an Annotation Scheme for ELL Spelling Errors
This paper describes an XML annotation scheme for English Language Learner (ELL) spelling errors in learner corpora which can be used to create standardized, annotated ELL error corpora for use by researchers who are developing spelling correction tools for ELLs. We also provide an error taxonomy (with examples of each error type) upon which the scheme was based.
Error-Tagged Learner Corpus of Czech
The paper describes a learner corpus of Czech, currently under development. The corpus captures Czech as used by nonnative speakers. We discuss its structure, the layered annotation of errors and the annotation process.
Annotating Errors in a Hungarian Learner Corpus
We are developing and annotating a learner corpus of Hungarian, composed of student journals from three different proficiency levels written at Indiana University. Our annotation marks learner errors that are of different linguistic categories, including phonology, morphology, and syntax, but defining the annotation for an agglutinative language presents several issues. First, we must adapt an ...
Publication date: 2012