Non-linear n-best List Reranking with Few Features
Authors
Abstract
In Machine Translation, it is customary to compute the model score of a predicted hypothesis as a linear combination of multiple features, where each feature assesses a particular facet of the hypothesis. The choice of a linear combination is usually justified by the possibility of efficient inference (decoding); yet the appropriateness of this simple combination scheme for the task at hand is rarely questioned. In this paper, we propose to replace the linear scoring function with a non-linear one. To investigate the applicability of this approach, we rescore n-best lists generated by a conventional machine translation engine (which uses a linear scoring function to produce its hypotheses) with a non-linear scoring function learned within the learning-to-rank framework. Moderate, though consistent, BLEU gains are demonstrated on the WMT’10, WMT’11 and WMT’12 test sets.
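To make the setup concrete, here is a minimal sketch (in Python with NumPy) of non-linear n-best reranking trained with a pairwise learning-to-rank objective. The one-hidden-layer scorer, the hinge loss, the oracle-pair sampling and all names and dimensions are illustrative assumptions, not the paper's actual model or training procedure.

```python
import numpy as np

# Sketch: score each n-best hypothesis with a small non-linear model and
# train it so that higher-BLEU hypotheses receive higher scores.
rng = np.random.default_rng(0)

def init_scorer(n_features, hidden=16):
    """Non-linear scorer: score(x) = w2 . tanh(W1 x + b1)."""
    return {"W1": rng.normal(scale=0.1, size=(hidden, n_features)),
            "b1": np.zeros(hidden),
            "w2": rng.normal(scale=0.1, size=hidden)}

def score(p, x):
    return p["w2"] @ np.tanh(p["W1"] @ x + p["b1"])

def pairwise_update(p, x_good, x_bad, lr=0.01, margin=1.0):
    """One SGD step on the hinge loss max(0, margin - (s_good - s_bad))."""
    if score(p, x_good) - score(p, x_bad) >= margin:
        return  # pair already ordered correctly with the required margin
    grads = {k: np.zeros_like(v) for k, v in p.items()}
    for x, sign in ((x_good, +1.0), (x_bad, -1.0)):
        h = np.tanh(p["W1"] @ x + p["b1"])
        dh = p["w2"] * (1.0 - h ** 2)          # d score / d pre-activation
        grads["w2"] += sign * h
        grads["W1"] += sign * np.outer(dh, x)
        grads["b1"] += sign * dh
    for k in p:                                # widen the score margin
        p[k] += lr * grads[k]

def train(nbest_lists, epochs=10):
    """nbest_lists: one list per source sentence, each a list of
    (feature_vector, sentence_bleu) pairs taken from the decoder's n-best."""
    params = init_scorer(len(nbest_lists[0][0][0]))
    for _ in range(epochs):
        for hyps in nbest_lists:
            best_feats, best_bleu = max(hyps, key=lambda h: h[1])
            for feats, bleu in hyps:
                if bleu < best_bleu:           # prefer the oracle hypothesis
                    pairwise_update(params, np.asarray(best_feats),
                                    np.asarray(feats))
    return params

def rerank(params, hyps):
    """Return the hypothesis whose features get the highest model score."""
    return max(hyps, key=lambda h: score(params, np.asarray(h[0])))
```

At test time, rerank would be applied to each n-best list produced by the baseline (linear) decoder, and the surface string of the winning hypothesis would be output.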
منابع مشابه
Word Lattice Reranking for Chinese Word Segmentation and Part-of-Speech Tagging
In this paper, we describe a new reranking strategy named word lattice reranking, for the task of joint Chinese word segmentation and part-of-speech (POS) tagging. As a derivation of the forest reranking for parsing (Huang, 2008), this strategy reranks on the pruned word lattice, which potentially contains many more candidates while using less storage, compared with the traditional n-best list ...
Forest Reranking: Discriminative Parsing with Non-Local Features
Conventional n-best reranking techniques often suffer from the limited scope of the n-best list, which rules out many potentially good alternatives. We instead propose forest reranking, a method that reranks the packed forest of exponentially many parses. Although exact inference is intractable with non-local features, we present an approximation algorithm that makes discriminative training pra...
Multi-Pass Decoding With Complex Feature Guidance for Statistical Machine Translation
In Statistical Machine Translation, some complex features are still difficult to integrate during decoding and are usually exploited through reranking of the k-best hypotheses produced by the decoder. We propose a translation table partitioning method that exploits the result of this reranking to iteratively guide the decoder in order to produce a new k-best list more relevant to some complex featur...
Improving Speech Recognizer Performance
This thesis investigates N-best hypotheses reranking techniques for improving speech recognition accuracy. We have focused on improving the accuracy of a speech recognizer used in a dialog system. Our post-processing approach uses a linear regression model to predict the error rate of each hypothesis from hypothesis features, and then outputs the one that has the lowest (recomputed) error rate ... (a minimal sketch of this regression-based selection appears after this list).
Reranking Translation Candidates Produced by Several Bilingual Word Similarity Sources
We investigate the reranking of the output of several distributional approaches on the Bilingual Lexicon Induction task. We show that reranking an n-best list produced by any of those approaches leads to very substantial improvements. We further demonstrate that combining several n-best lists by reranking is an effective way of further boosting performance.
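As an illustration of the error-rate regression idea in the "Improving Speech Recognizer Performance" entry above, the following is a minimal sketch assuming scikit-learn and entirely hypothetical per-hypothesis features and error rates; it is not that thesis's actual system.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical development data: one feature vector (e.g. acoustic score,
# LM score, hypothesis length) and one measured word error rate (WER) per
# hypothesis. All values are made up for illustration.
X_dev = np.array([[-42.0, -7.1, 9],
                  [-45.3, -6.8, 9],
                  [-41.7, -8.0, 10],
                  [-47.2, -6.5, 8]])
y_dev = np.array([0.10, 0.22, 0.15, 0.30])    # observed WER per hypothesis

model = LinearRegression().fit(X_dev, y_dev)

# At test time, predict an error rate for every hypothesis in the n-best
# list and output the one with the lowest predicted error rate.
nbest = [("first candidate transcription",  np.array([-43.0, -7.0, 9])),
         ("second candidate transcription", np.array([-44.5, -6.9, 9]))]
best_text, _ = min(nbest, key=lambda h: model.predict(h[1].reshape(1, -1))[0])
print(best_text)
```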
Publication year: 2012