Utilizing Semantic Equivalence Classes of Japanese Functional Expressions in Translation Rule Acquisition from Parallel Patent Sentences

Authors

  • Taiji Nagasaka
  • Ran Shimanouchi
  • Akiko Sakamoto
  • Takafumi Suzuki
  • Yohei Morishita
  • Takehito Utsuro
  • Suguru Matsuyoshi
Abstract

In the “Sandglass” MT architecture, we identify the class of monosemous Japanese functional expressions and utilize it in the task of translating Japanese functional expressions into English. We employ the semantic equivalence classes of a recently compiled large-scale hierarchical lexicon of Japanese functional expressions, and study whether the functional expressions within a class can be translated into a single canonical English expression. Based on the results of identifying monosemous semantic equivalence classes, this paper studies how to extract rules for translating functional expressions in Japanese patent documents into English. We use about 1.8M Japanese-English parallel sentences automatically extracted from Japanese-English patent families, which are distributed through the Patent Translation Task at the NTCIR-7 Workshop. We then apply Moses, a toolkit for phrase-based SMT (Statistical Machine Translation), and obtain Japanese-English translation pairs in the form of a phrase translation table. Finally, we extract translation pairs of Japanese functional expressions from the phrase translation table. Through this study, we found that most of the semantic equivalence classes judged as monosemous based on manual translation into English have only one translation rule even in the patent domain.
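
The last two steps of the abstract (reading a phrase translation table and keeping only the entries whose source side is a functional expression) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the common Moses text layout `source ||| target ||| scores ||| alignment ||| counts`, with the direct phrase translation probability as the third score. The function names, the probability threshold, and the toy entries are all hypothetical.

```python
from collections import defaultdict


def parse_phrase_table(lines):
    """Yield (source, target, scores) from Moses-style phrase table lines."""
    for line in lines:
        fields = [f.strip() for f in line.split("|||")]
        if len(fields) < 3:
            continue  # skip malformed lines
        scores = [float(s) for s in fields[2].split()]
        yield fields[0], fields[1], scores


def extract_rules(lines, functional_expressions, min_prob=0.1, direct_index=2):
    """Collect candidate translations for each functional expression.

    direct_index=2 assumes the third score is the direct phrase
    translation probability; min_prob is an illustrative threshold.
    """
    rules = defaultdict(list)
    for source, target, scores in parse_phrase_table(lines):
        if source not in functional_expressions:
            continue
        if len(scores) <= direct_index or scores[direct_index] < min_prob:
            continue
        rules[source].append((target, scores[direct_index]))
    # Sort each candidate list by probability, highest first.
    return {s: sorted(c, key=lambda x: -x[1]) for s, c in rules.items()}


def class_translations(rules, equivalence_class):
    """Set of English translations found for any expression in the class."""
    return {t for expr in equivalence_class for t, _ in rules.get(expr, [])}


def is_monosemous(rules, equivalence_class):
    """True if the whole class maps to exactly one English translation."""
    return len(class_translations(rules, equivalence_class)) == 1


# Toy data, invented for illustration only.
sample = [
    "に つい て ||| about ||| 0.6 0.5 0.7 0.4 ||| 0-0 ||| 10 10 7",
    "に つい て ||| of ||| 0.1 0.1 0.05 0.1 ||| 0-0 ||| 10 10 1",
    "に 関し て ||| about ||| 0.5 0.4 0.8 0.5 ||| 0-0 ||| 5 5 4",
    "装置 ||| device ||| 0.9 0.8 0.9 0.8 ||| 0-0 ||| 20 20 18",
]
functional = {"に つい て", "に 関し て"}
rules = extract_rules(sample, functional)
print(rules)
print(is_monosemous(rules, ["に つい て", "に 関し て"]))
```

In this sketch a semantic equivalence class counts as monosemous when all of its member expressions share a single surviving English translation, which mirrors the abstract's question of whether a class can be rendered by one canonical English expression.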

Similar resources

Example-based Translation of Japanese Functional Expressions utilizing Semantic Equivalence Classes

This paper studies issues on machine translation of Japanese functional expressions into English. Unlike our previous works, in order to address the issue of resolving various ambiguities of a compound expression, this paper takes the approach of example-based machine translation. In this approach, a patent translation example database is developed given the phrase translation tables trained wi...

Identifying and Utilizing the Class of Monosemous Japanese Functional Expressions in Machine Translation

In the “Sandglass” machine translation architecture, we identify the class of monosemous Japanese functional expressions and utilize it in the task of translating Japanese functional expressions into English. We employ the semantic equivalence classes of a recently compiled large scale hierarchical lexicon of Japanese functional expressions. We then study whether functional expressions within a...

Japanese-English Translation Through Internal Expressions

This paper describes an approach to Japanese-English translation through internal expressions which are similar to those used in our recent approach to English-Japanese translation [2]. Attention is focused on construction of the internal expressions of Japanese sentences based on case structures of predicates and also conversion of the Japanese internal expressions to the English ones for gener...

Collecting Bilingual Technical Terms from Patent Families of Character-Segmented Chinese Sentences and Morpheme-Segmented Japanese Sentences

In manual translation of patent documents, a bilingual lexicon of technical terms is indispensable for a translator to translate patent documents efficiently. Dong et al. (2015) proposed a method of generating a bilingual technical term lexicon from morpheme-segmented parallel patent sentences. The proposed method estimates Japanese-Chinese translation of technical terms using the phrase translation tab...

Identifying Japanese-Chinese Bilingual Synonymous Technical Terms from Patent Families

In the task of acquiring Japanese-Chinese technical term translation equivalent pairs from parallel patent documents, this paper considers situations where a technical term is observed in many parallel patent sentences and is translated into many translation equivalents and studies the issue of identifying synonymous translation equivalent pairs. First, we collect candidates of synonymous trans...

Publication date: 2010