The Reordering Problem in Statistical Machine Translation
Abstract
Word reordering is one of the most visible changes when a sentence is translated from one language to another, and it has always been a central concern for machine translation. In this report, we examine the reordering problem in the context of Statistical Machine Translation (SMT). We first study and classify the reordering divergences between languages, emphasizing the role that a linguistic understanding of these divergence patterns plays in developing efficient and effective solutions to the reordering problem. We then study how reordering is modeled in the major SMT models and analyze these models with respect to coverage of divergence patterns, model complexity, and use of linguistic resources. Machine translation is distinguished from other machine learning problems by the complexity of inference, known as decoding in machine translation terminology. We describe methods used to perform decoding efficiently while attempting to find the best possible reordering. We also look at extensions beyond the noisy channel model, in the form of source reordering and discriminative re-ranking, which attempt to further improve reordering. Our aim has been to study all aspects of the reordering problem holistically, across all components of an SMT system, and to identify potential areas of research.
Similar resources
Reordering Models for Statistical Machine Translation: A Literature Survey
In this survey, we briefly study various reordering models that are used with statistical translation models. The reordering model is one of the important components of any statistical machine translation system. The reordering problem is itself NP-hard. In this survey, we study various reordering approaches that can be used to solve this problem. We first study simple distortion-based reordering whi...
A Direct Syntax-Driven Reordering Model for Phrase-Based Machine Translation
This paper presents a direct word reordering model with novel syntax-based features for statistical machine translation. Reordering models address the problem of reordering source language into the word order of the target language. IBM Models 3 through 5 have reordering components that use surface word information but very little context information to determine the traversal order of the sour...
Advancements in Reordering Models for Statistical Machine Translation
In this paper, we propose a novel reordering model based on sequence labeling techniques. Our model converts the reordering problem into a sequence labeling problem, i.e. a tagging task. Results on five Chinese-English NIST tasks show that our model improves the baseline system by 1.32 BLEU and 1.53 TER on average. Results of a comparative study with seven other widely used reordering models will...
Word-reordering for Statistical Machine Translation Using Trigram Language Model
In this paper we study the word-reordering problem in the decoding part of statistical machine translation, but independently from the target language generating process. In this model, a permuted sentence is given and the goal is to recover the correct order. We introduce a greedy algorithm called Local-(k, l)-Step, and show that it performs better than the DP-based algorithm. Our word-reorder...
LSTM Neural Reordering Feature for Statistical Machine Translation
Artificial neural networks are powerful models that have been widely applied to many aspects of machine translation, such as language modeling and translation modeling. Though notable improvements have been made in these areas, the reordering problem remains a challenge in statistical machine translation. In this paper, we present a novel neural reordering model that directly models ...
Publication date: 2012