Automatic generation of paraphrases to be used as translation references in objective evaluation measures of machine translation
Abstract
We propose a method that automatically generates sets of paraphrases from seed sentences, for use as reference sets in objective machine translation (MT) evaluation measures such as BLEU and NIST. In an experiment, we measured the quality of the generated paraphrases on three criteria: (i) grammaticality: at least 99% of the sentences are correct; (ii) equivalence in meaning: at least 96% are correct paraphrases, either by meaning equivalence or by entailment; and (iii) internal lexical and syntactic variation within a set of paraphrases: slightly higher than that of hand-produced sets. The paraphrase sets produced by this method thus seem adequate as reference sets for MT evaluation.
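To illustrate why the size and variety of a reference set matter for measures like BLEU, here is a minimal sketch of the clipped (modified) n-gram precision that BLEU is built on, computed against one or several references. It is not the paraphrase generation method of the paper; the function names and the example sentences are hypothetical, and brevity penalty, smoothing and the NIST information weights are deliberately left out.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against a set of references.

    Each candidate n-gram count is clipped to the maximum number of times
    that n-gram occurs in any single reference.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For every n-gram, keep its highest count over all references.
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


# Hypothetical example data: one MT output, one reference, one extra paraphrase.
candidate = "the cat sat on the mat".split()
single_reference = ["a cat was sitting on the mat".split()]
reference_set = single_reference + ["the cat sat on a mat".split()]

for n in (1, 2):
    p_single = modified_precision(candidate, single_reference, n)
    p_set = modified_precision(candidate, reference_set, n)
    print(f"{n}-gram precision: single reference {p_single:.2f}, "
          f"reference set {p_set:.2f}")
```

In this toy case the extra paraphrase licenses wordings that the single reference does not contain, so an acceptable candidate is penalized less. This is the general motivation for using larger, more varied reference sets.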
Similar resources
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages are still in question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Paraphrasing Headlines by Machine Translation
In this paper we investigate the automatic collection, generation and evaluation of sentential paraphrases. Valuable sources of paraphrases are news article headlines; they tend to describe the same event in various different ways, and can easily be obtained from the web. We describe a method for generating paraphrases by using a large aligned monolingual corpus of news headlines acquired autom...
Manual and Automatic Paraphrases for MT Evaluation
Paraphrasing of reference translations has been shown to improve the correlation with human judgements in automatic evaluation of machine translation (MT) outputs. In this work, we present a new dataset for evaluating English-Czech translation based on automatic paraphrases. We compare this dataset with an existing set of manually created paraphrases and find that even automatic paraphrases can...
Creating and using large monolingual parallel corpora for sentential paraphrase generation
In this paper we investigate the automatic generation of paraphrases by using machine translation techniques. Three contributions we make are the construction of a large paraphrase corpus for English and Dutch, a re-ranking heuristic to use machine translation for paraphrase generation and a proper evaluation methodology. A large parallel corpus is constructed by aligning clustered headlines th...
Paraphrasing and Translation
Usefulness of paraphrases
• Paraphrases are alternative ways of conveying the same information
• Useful in NLP applications such as:
  – Generation: producing paraphrases allows for the creation of more varied and fluent text
  – Multidocument summarization: identifying paraphrases allows information repeated across documents to be condensed
  – Question answering: paraphrasing is important when going be...
Publication date: 2005