Search results for: bilingual lexicon

Number of results: 20633

2010
Adrien Lardilleux Julien Gosme Yves Lepage

In this paper, we present a simple protocol to evaluate word aligners on bilingual lexicon induction tasks from parallel corpora. Rather than resorting to gold standards, it relies on a comparison of the outputs of word aligners against a reference bilingual lexicon. The quality of this reference bilingual lexicon does not need to be particularly high, because evaluation quality is ensured by s...

2007
D. Hiemstra

In this paper we describe a systematic approach to derive a bilingual lexicon automatically from parallel corpora. Following this approach, a lexicon was derived from the English and Dutch versions of the Agenda corpus. With the lexicon and a part of the corpus that was not used to derive the lexicon, a bilingual retrieval environment was built. Recall and precision of monolingual Dutch retrieval wa...

2002
Fatiha Sadat Hervé Déjean Éric Gaussier

In this paper we present a method to extract bilingual terminologies from comparable non-aligned corpora by using multiple linguistic knowledge sources, such as non-parallel corpora, bilingual thesauri, a preliminary bilingual dictionary, etc. We focus on two core technologies: bilingual lexicon extraction from comparable corpora and expansion through thesauri categories based on different ...

2013
John Richardson Toshiaki Nakazawa Sadao Kurohashi

We present a high-precision, language-independent transliteration framework applicable to bilingual lexicon extraction. Our approach is to employ a bilingual topic model to enhance the output of a state-of-the-art grapheme-based transliteration baseline. We demonstrate that this method is able to extract a high-quality bilingual lexicon from a comparable corpus, and we extend the topic model to p...

2012
Rahma Sellami Fatiha Sadat Lamia Hadrich Belguith

Bilingual lexicon extraction from Wikipedia. With the increased interest in machine translation, the need for multilingual resources such as comparable corpora and bilingual lexicons has increased. These resources are mostly unavailable for language pairs that do not involve English. This...

2016
Pankaj Singh Ashish Kulkarni Himanshu Ojha Vishwajeet Kumar Ganesh Ramakrishnan

Statistical machine translation models are known to benefit from the availability of a domain bilingual lexicon. Bilingual lexicons are traditionally comprised of multiword expressions, either extracted from parallel corpora or manually curated. We claim that “patterns”, comprised of words and higher order categories, generalize better in capturing the syntax and semantics of the domain. In thi...

2013
Itsuki Toyota Zi Long Lijuan Dong Takehito Utsuro Mikio Yamamoto

In previous methods of generating a bilingual lexicon from parallel patent sentences extracted from patent families, the portion from which parallel patent sentences are extracted is only about 30% of the whole “Background” and “Embodiment” parts, and the remaining 70% or so is not used. Considering this situation, this paper proposes to generate a bilingual lexicon for technical terms not only from the 30% b...

2004
Magnus Sahlgren

This paper presents a very simple and effective approach to automatic bilingual lexicon acquisition. The approach is co-occurrence-based, and uses the Random Indexing vector space methodology applied to aligned bilingual data. The approach is simple, efficient and scalable, and generates promising results when compared to a manually compiled lexicon. The paper also discusses some of the methodolo...

2004
Hiroyuki Kaji

An improved method for extracting translation equivalents from bilingual comparable corpora according to contextual similarity was developed. This method has two main features. First, a seed bilingual lexicon—which is used to bridge contexts in different languages—is adapted to the corpora from which translation equivalents are to be extracted. Second, the contextual similarity is evaluated by ...

2004
Mathieu Lafourcade Frédéric Rodrigo

This paper assesses the possibilities of constructing a multilingual lexicon by propagating conceptual vectors through several monolingual and bilingual resources. The system is based on a vector model in order to learn meanings and, potentially, to select and classify them. Bilingual resources make it possible to project vectors onto the target lexicon and semantic space.
