A comparative evaluation of modern English corpus grammatical annotation schemes

Authors

  • Eric Atwell
  • George Demetriou
  • John Hughes
  • Amanda Schiffrin
  • Clive Souter
  • Sean Wilcock

Similar resources

Building a Large Annotated Corpus of Learner English: The NUS Corpus of Learner English

We describe the NUS Corpus of Learner English (NUCLE), a large, fully annotated corpus of learner English that is freely available for research purposes. The goal of the corpus is to provide a large data resource for the development and evaluation of grammatical error correction systems. Although NUCLE has been available for almost two years, there has been no reference paper that describes the...

Briefly Noted English for the Computer: The SUSANNE Corpus and Analytic Scheme

Over the past 10–20 years, there has been increasing interest in grammatical / syntactic annotation schemes for corpora. Annotated corpora are essential for training and testing taggers and parsers, for describing the use of lexical and grammatical features, and for comprehensive analyses of registers or sublanguages. Several annotation schemes have been developed over this period, including bo...

Lost in Grammar Translation

http://www.di.unito.it/∼tutreeb/ Italian Treebank (VIT), and the ISST. None of them is comparable in size with the English Penn Treebank. This limits the possibility of inducing reliable grammars for Italian. Initial studies have shown that probabilistic grammars induced on a small corpus do not perform impressively [5]. Larger corpora are therefore needed. We have been working o...

A Multilingual Parallel Parsed Corpus as Gold Standard for Grammatical Inference Evaluation

In this article we investigate how (computational) grammar inference systems are evaluated and how the evaluation procedure can be improved. First, we describe the currently used evaluation methods and look at the advantages and disadvantages of each method. The main problems of the methods are: the dependency on language experts, the influence of the annotation scheme of language data, and the...

Grammatical Error Annotation for Korean Learners of Spoken English

The goal of our research is to build a grammatical error-tagged corpus for Korean learners of spoken English, dubbed the Postech Learner Corpus. We collected raw story-telling speech from Korean university students. Transcription and annotation using the Cambridge Learner Corpus tagset were performed by six Korean annotators fluent in English. For the annotation of the corpus, we developed an annota...

Publication date: 2000