Visualizing Data Structures in Parsing-Based Machine Translation
Abstract
As machine translation (MT) systems grow more complex and incorporate more linguistic knowledge, it becomes more difficult to evaluate independent pieces of the MT pipeline. Being able to inspect many of the intermediate data structures used during MT decoding allows a more fine-grained evaluation of MT performance, helping to determine which parts of the current process are effective and which are not. In this article, we present an overview of the visualization tools that are currently distributed with the Joshua (Li et al., 2009) MT decoder. We explain their use and present an example of how visually inspecting the decoder’s data structures has led to useful improvements in the MT model.
Similar resources
Automatic Semantic Role Labeling in Persian Sentences Using Dependency Trees
Automatically identifying words with semantic roles (such as Agent, Patient, Source, etc.) in sentences and attaching the correct roles to them may lead to improvements in many natural language processing tasks, including information extraction, question answering, text summarization and machine translation. Semantic role labeling systems usually take advantage of syntactic parsing and th...
Visualizing Deep-Syntactic Parser Output
“Deep-syntactic” dependency structures bridge the gap between the surface-syntactic structures produced by state-of-the-art dependency parsers and semantic logical forms: they abstract away from surface-syntactic idiosyncrasies but still keep the linguistic structure of a sentence. They thus have great potential for such downstream applications as machine translation and summarizati...
Sub-Sentence Division for Tree-Based Machine Translation
Tree-based statistical machine translation models have made significant progress in recent years, especially when replacing 1-best trees with packed forests. However, because parsing accuracy usually drops sharply as sentence length increases, translating long sentences often takes a long time and produces only degenerate translations. We propose a new method named subsentence div...
The MetaMorpho Translation System
In this article, we present MetaMorpho, a rule-based machine translation system that was used to create MorphoLogic’s submission to the WMT08 shared Hungarian-to-English translation task. The architecture of MetaMorpho does not fit easily into traditional categories of rule-based systems: the building blocks of its grammar are pairs of rules that describe source and target language structures i...
A Hybrid System for Chinese-English Patent Machine Translation
This paper presents a novel hybrid system, which combines rule-based machine translation (RBMT) with phrase-based statistical machine translation (SMT), to translate Chinese patent texts into English. The hybrid architecture is basically guided by the RBMT engine which processes source language parsing and transformation, generating proper syntactic trees for the target language. In the generat...
Journal title:
- Prague Bull. Math. Linguistics
Volume 93
Publication date: 2010