Word Graphs for Statistical Machine Translation

نویسندگان

  • Richard Zens
  • Hermann Ney
چکیده

Word graphs have various applications in the field of machine translation. Therefore it is important for machine translation systems to produce compact word graphs of high quality. We will describe the generation of word graphs for state of the art phrase-based statistical machine translation. We will use these word graph to provide an analysis of the search process. We will evaluate the quality of the word graphs using the well-known graph word error rate. Additionally, we introduce the two novel graph-to-string criteria: the position-independent graph word error rate and the graph BLEU score. Experimental results are presented for two Chinese–English tasks: the small IWSLT task and the NIST large data track task. For both tasks, we achieve significant reductions of the graph error rate already with compact word graphs.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

متن کامل

Lattice Parsing to Integrate Speech Recognition and Rule-Based Machine Translation

In this paper, we present a novel approach to integrate speech recognition and rulebased machine translation by lattice parsing. The presented approach is hybrid in two senses. First, it combines structural and statistical methods for language modeling task. Second, it employs a chart parser which utilizes manually created syntax rules in addition to scores obtained after statistical processing...

متن کامل

Statistical machine translation with cascaded probabilistic transducers

Statistical machine translation is based on the idea to extract information from bilingual corpora, which can be used to generate new translations. The current work combines aspects from example-based machine translation and from grammar-based approaches, esp. bilingual regular grammars, to develop a statistical translation system based on cascaded transducers. These transducers can be construc...

متن کامل

Efficient Phrase-Table Representation for Machine Translation with Applications to Online MT and Speech Translation

In phrase-based statistical machine translation, the phrase-table requires a large amount of memory. We will present an efficient representation with two key properties: on-demand loading and a prefix tree structure for the source phrases. We will show that this representation scales well to large data tasks and that we are able to store hundreds of millions of phrase pairs in the phrase-table....

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005