Automatic extraction of translations from web-based bilingual materials
Similar resources
Automatic Bilingual Phrase Extraction from Comparable Corpora
In this work we present an approach for extracting parallel phrases from comparable news articles to improve statistical machine translation. This is particularly useful for under-resourced languages where parallel corpora are not readily available. Our approach consists of a phrase pair generator that automatically generates candidate parallel phrases and a binary SVM classifier that classifie...
Automatic Extraction of Medical Term Variants from Multilingual Parallel Translations
The extraction of terms and their variants is an important issue in various applications of natural language processing (NLP), such as question answering and information retrieval. In this chapter we discuss a method to automatically extract medical terms and their variants from a multilingual corpus of parallel translations. As a first step terms are extracted using a pattern-based approach. I...
Using Bilingual Web Data to Mine and Rank Translations
English Reading Wizard Full machine translation has made substantial achievements, but its quality hasn’t reached a satisfactory level. Figure 1 shows such a system’s Chinese-to-English translation. English speakers can get a rough sense of what the original Chinese text describes, but they’ll probably have difficulties understanding the details. (For an example machine translation system, see ...
Automatic Extraction of Information from the Web
The semantic Web will bring meaning to the Internet, making it possible for web agents to understand the information it contains. However, current trends seem to suggest that the semantic web is not likely to be adopted in the forthcoming years. In this sense, meaningful information extraction from the web becomes a handicap for web agents. In this article, we present a framework for automatic ...
Automatic Extraction of Knowledge from Web Documents
A large amount of digital information available is written as text documents in the form of web pages, reports, papers, emails, etc. Extracting the knowledge of interest from such documents from multiple sources in a timely fashion is therefore crucial. This paper provides an update on the Artequakt system which uses natural language tools to automatically extract knowledge about artists from m...
Journal
Journal title: Machine Translation
Year: 2007
ISSN: 0922-6567,1573-0573
DOI: 10.1007/s10590-008-9040-7