A comparative evaluation of modern English corpus grammatical annotation schemes
Similar resources
Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English
We describe the NUS Corpus of Learner English (NUCLE), a large, fully annotated corpus of learner English that is freely available for research purposes. The goal of the corpus is to provide a large data resource for the development and evaluation of grammatical error correction systems. Although NUCLE has been available for almost two years, there has been no reference paper that describes the...
Briefly Noted English for the Computer: The SUSANNE Corpus and Analytic Scheme
Over the past 10–20 years, there has been increasing interest in grammatical / syntactic annotation schemes for corpora. Annotated corpora are essential for training and testing taggers and parsers, for describing the use of lexical and grammatical features, and for comprehensive analyses of registers or sublanguages. Several annotation schemes have been developed over this period, including bo...
Lost in Grammar Translation
http://www.di.unito.it/∼tutreeb/ Italian Treebank (VIT), and the ISST. None of them is comparable in size with the English Penn Treebank. This limits the possibility of inducing reliable grammars for Italian. Initial studies have shown that probabilistic grammars induced on a small corpus do not perform impressively [5]. Building larger corpora is therefore needed. We have been working o...
A Multilingual Parallel Parsed Corpus as Gold Standard for Grammatical Inference Evaluation
In this article we investigate how (computational) grammar inference systems are evaluated and how the evaluation procedure can be improved. First, we describe the currently used evaluation methods and look at the advantages and disadvantages of each method. The main problems of the methods are: the dependency on language experts, the influence of the annotation scheme of language data, and the...
Grammatical Error Annotation for Korean Learners of Spoken English
The goal of our research is to build a grammatical error-tagged corpus for Korean learners of Spoken English dubbed Postech Learner Corpus. We collected raw story-telling speech from Korean university students. Transcription and annotation using the Cambridge Learner Corpus tagset were performed by six Korean annotators fluent in English. For the annotation of the corpus, we developed an annota...
Publication date: 2000