Detection of Incorrect Case Assignments in Paraphrase Generation

Authors

  • Atsushi Fujita
  • Kentaro Inui
  • Yuji Matsumoto
Abstract

This paper addresses post-transfer processing in paraphrasing. Our previous investigation into transfer errors revealed that case assignment tends to be incorrect, irrespective of the type of transfer, in lexical and structural paraphrasing of Japanese sentences [3]. Motivated by this observation, we propose an empirical method to detect incorrect case assignments. Our error detection model combines two error detection models that are trained separately, one on a large collection of positive examples and the other on a small collection of manually labeled negative examples. Experimental results show that our combined model significantly outperforms the baseline model, which is trained only on positive examples. We also propose a selective sampling scheme to reduce the cost of collecting negative examples, and confirm its effectiveness in the error detection task.

Similar resources

Islanding Detection Method of Distributed Generation Based on Wavenet

Due to the increasing need for distributed energy resources in power systems, their problems should be studied. One of the main problems of distributed energy resources is unplanned islanding. Unplanned islanding poses dangers to the power system and to repair personnel who work on faulty devices. In this paper, a passive local method is proposed. The proposed method is based on...

Paraphrase and Textual Entailment Generation in Czech

Paraphrase and textual entailment generation can support natural language processing (NLP) tasks that simulate text understanding, e.g., text summarization, plagiarism detection, or question answering. A paraphrase, i.e., a sentence with the same meaning, conveys a certain piece of information with new words and new syntactic structures. Textual entailment, i.e., an inference that humans will j...

Plagiarism Meets Paraphrasing: Insights for the Next Generation in Automatic Plagiarism Detection

Although paraphrasing is the linguistic mechanism underlying many plagiarism cases, little attention has been paid to its analysis in the framework of automatic plagiarism detection. Therefore, state-of-the-art plagiarism detectors find it difficult to detect cases of paraphrase plagiarism. In this article, we analyze the relationship between paraphrasing and plagiarism, paying special attentio...

Detection of Incorrect Case Assignments in Automatically Generated Paraphrases of Japanese Sentences

This paper addresses the issue of correcting transfer errors in paraphrasing. Our previous investigation into transfer errors occurring in lexical and structural paraphrasing of Japanese sentences revealed that case assignment tends to be incorrect, irrespective of the types of transfer (Fujita and Inui, 2003). Motivated by this observation, we propose an empirical method to detect incorrect ca...

New Functions for Unsupervised Asymmetrical Paraphrase Detection

Monolingual text-to-text generation is an emerging research area in Natural Language Processing. One reason for the interest in such generation systems is the possibility to automatically learn text-to-text generation strategies from aligned monolingual corpora. In this context, paraphrase detection can be seen as the task of aligning sentences that convey the same information but yet are writt...

Publication date: 2004