Continuous Space Reordering Models for Phrase-based MT
نویسندگان
چکیده
Bilingual sequence models improve phrase-based translation and reordering by overcoming phrasal independence assumption and handling long range reordering. However, due to data sparsity, these models often fall back to very small context sizes. This problem has been previously addressed by learning sequences over generalized representations such as POS tags or word clusters. In this paper, we explore an alternative based on neural network models. More concretely we train neuralized versions of lexicalized reordering [1] and the operation sequence models [2] using feed-forward neural network. Our results show improvements of up to 0.6 and 0.5 BLEU points on top of the baseline German→English and English→German systems. We also observed improvements compared to the systems that used POS tags and word clusters to train these models. Because we modify the bilingual corpus to integrate reordering operations, this allows us to also train a sequence-to-sequence neural MT model having explicit reordering triggers. Our motivation was to directly enable reordering information in the encoder-decoder framework, which otherwise relies solely on the attention model to handle long range reordering. We tried both coarser and fine-grained reordering operations. However, these experiments did not yield any improvements over the baseline Neural MT systems.
منابع مشابه
A Neural Reordering Model for Phrase-based Translation
While lexicalized reordering models have been widely used in phrase-based translation systems, they suffer from three drawbacks: context insensitivity, ambiguity, and sparsity. We propose a neural reordering model that conditions reordering probabilities on the words of both the current and previous phrase pairs. Including the words of previous phrase pairs significantly improves context sensit...
متن کاملA joint translation model with integrated reordering
This dissertation aims at combining the benefits and to remedy the flaws of the two popular frameworks in statistical machine translation, namely Phrasebased MT and N-gram-based MT. Phrase-based MT advanced the state-of-the art towards translating phrases3 than words. By memorizing phrases, phrasal MT, is able to learn local reorderings, and handling of other local dependencies such as insertio...
متن کاملLexicalized Reordering for Left-to-Right Hierarchical Phrase-based Translation
Phrase-based and hierarchical phrasebased (Hiero) translation models differ radically in the way reordering is modeled. Lexicalized reordering models play an important role in phrase-based MT and such models have been added to CKY-based decoders for Hiero. Watanabe et al. (2006) propose a promising decoding algorithm for Hiero (LR-Hiero) that visits input spans in arbitrary order and produces t...
متن کاملThe RWTH Aachen German to English MT System for IWSLT 2015
This work describes the statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2015. We participated in the MT and SLT tracks for the German→English language pair. We employ our state-of-the-art phrase-based and hierarchical phrase-based baseline systems for the MT track. ...
متن کاملPhrase reordering for statistical machine translation based on predicate-argument structure
In this paper, we describe a novel phrase reordering model based on predicate-argument structure. Our phrase reordering method utilizes a general predicate-argument structure analyzer to reorder source language chunks based on predicate-argument structure. We explicitly model longdistance phrase alignments by reordering arguments and predicates. The reordering approach is applied as a preproces...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1801.08337 شماره
صفحات -
تاریخ انتشار 2018