Neural Machine Transliteration: Preliminary Results
نویسنده
چکیده
Machine transliteration is the process of automatically transforming the script of a word from a source language to a target language, while preserving pronunciation. Sequence to sequence learning has recently emerged as a new paradigm in supervised learning. In this paper a character-based encoder-decoder model has been proposed that consists of two Recurrent Neural Networks. The encoder is a Bidirectional recurrent neural network that encodes a sequence of symbols into a fixed-length vector representation, and the decoder generates the target sequence using an attention-based recurrent neural network. The encoder, the decoder and the attention mechanism are jointly trained to maximize the conditional probability of a target sequence given a source sequence. Our experiments on different datasets show that the proposed encoderdecoder model is able to achieve significantly higher transliteration quality over traditional statistical models.
منابع مشابه
Sequence-to-sequence neural network models for transliteration
Transliteration is a key component of machine translation systems and software internationalization. This paper demonstrates that neural sequence-to-sequence models obtain state of the art or close to state of the art results on existing datasets. In an effort to make machine transliteration accessible, we open source a new Arabic to English transliteration dataset and our trained models.
متن کاملApplying Neural Networks to English-Chinese Named Entity Transliteration
This paper presents the machine transliteration systems that we employ for our participation in the NEWS 2016 machine transliteration shared task. Based on the prevalent deep learning models developed for general sequence processing tasks, we use convolutional neural networks to extract character level information from the transliteration units and stack a simple recurrent neural network on top...
متن کاملThe AFRL-MITLL WMT16 News-Translation Task System: We put NMT in your MT Rescoring so you can MT while you MT
This paper describes the AFRL-MITLL statistical machine translation systems and the improvements that were developed during the WMT16 evaluation campaign. As part of these efforts we’ve adapted a variety new techniques to our previous years’ systems including Neural Machine Translation, additional out-of-vocabulary transliteration techniques, and morphology generation. Preliminary results are d...
متن کاملEnglish-Hindi Transliteration Using Context-Informed PB-SMT: the DCU System for NEWS 2009
This paper presents English—Hindi transliteration in the NEWS 2009 Machine Transliteration Shared Task adding source context modeling into state-of-the-art log-linear phrase-based statistical machine translation (PB-SMT). Source context features enable us to exploit source similarity in addition to target similarity, as modelled by the language model. We use a memory-based classification framew...
متن کاملHindi Transliteration Using Context - Informed PB - SMT : the DCU System for NEWS 2009
This paper presents English—Hindi transliteration in the NEWS 2009 Machine Transliteration Shared Task adding source context modeling into state-of-the-art log-linear phrase-based statistical machine translation (PB-SMT). Source context features enable us to exploit source similarity in addition to target similarity, as modelled by the language model. We use a memory-based classification framew...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1609.04253 شماره
صفحات -
تاریخ انتشار 2016