Search results for: bilingual lexicon
Number of results: 20633
Crosslingual word embeddings represent lexical items from different languages using the same vector space, enabling crosslingual transfer. Most prior work constructs embeddings for a pair of languages, with English on one side. We investigate methods for building high quality crosslingual word embeddings for many languages in a unified vector space. In this way, we can exploit and combine infor...
Alignment from comparable corpora usually involves two languages, one source and one target language. Previous work on bilingual lexicon extraction from parallel corpora demonstrated that more than two languages can be useful to improve the alignments. Our work has investigated to what extent a third language could be used to bypass the original alignment. We have defined two origina...
The linguistic features of material in Cultural Heritage (CH) archives may be in various languages requiring a facility for effective multilingual search. The specialised language often associated with CH content introduces problems for automatic translation to support search applications. The MultiMatch project is focused on enabling users to interact with CH content across different media typ...
This paper reports on our participation in the NEWS 2011 shared task on transliteration generation with a syllable-based Backward Maximum Matching system. The system uses the Onset First Principle to syllabify English names and align them with Chinese names. The bilingual lexicon containing aligned segments of various syllable lengths subsequently allows direct transliteration by chunks. The of...
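The chunk-by-chunk lookup described in this abstract can be illustrated with a minimal sketch of generic backward maximum matching. This is not the authors' system: the function name, the `max_len` parameter, the pass-through fallback for unknown syllables, and the toy lexicon with placeholder targets are all assumptions made for illustration.

```python
def backward_maximum_match(syllables, lexicon, max_len=3):
    """Segment `syllables` right to left, preferring the longest lexicon match."""
    chunks = []
    end = len(syllables)
    while end > 0:
        # Try the longest window ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            source = tuple(syllables[end - size:end])
            if source in lexicon:
                chunks.append((source, lexicon[source]))
                end -= size
                break
        else:
            # Assumed fallback: no entry covers the last syllable,
            # so pass it through unchanged and move on.
            chunks.append(((syllables[end - 1],), syllables[end - 1]))
            end -= 1
    chunks.reverse()  # chunks were collected from the end of the name
    return chunks


# Hypothetical toy lexicon: source syllable tuples mapped to placeholder
# targets. These are made-up symbols, not real transliterations.
TOY_LEXICON = {
    ("a",): "X1",
    ("b",): "X2",
    ("b", "c"): "X3",
    ("a", "b"): "X4",
}

result = backward_maximum_match(["a", "b", "c"], TOY_LEXICON)
print(result)
```

With this toy lexicon, scanning from the right picks the two-syllable entry `("b", "c")` first and then `("a",)`, whereas a left-to-right scan would have consumed `("a", "b")` and left `"c"` uncovered.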
This paper presents a multi-view approach for term translation spotting, based on a bilingual lexicon and comparable corpora. We propose to study different levels of representation for a term: the context, the theme and the orthography. These three approaches are studied individually and combined in order to rank translation candidates. We focus our task on French-English medical terms. Experim...
Automatically generating bilingual dictionaries from parallel, manually translated texts is a well-established technique that works well in practice. However, parallel texts are a scarce resource. Therefore, it is also desirable to be able to generate dictionaries from pairs of comparable monolingual corpora. For most languages, such corpora are much easier to acquire, and often in considerably...
Scandinavian languages like Swedish employ tonal accent as a lexical phonological feature, where suprasegmental information can be the sole factor differentiating between words. Using cross-modal semantic fragment priming we tested the following: (a) Do monolingual speakers of Swedish use tonal accent information during lexical access? (b) Do bilingual speakers, who grew up with one tonal (Swed...
We aim to replace the long and complicated pipeline employed to produce probabilistic phrasal bilexica with a theoretically principled, grammar-based approach. To this end, we introduce a learning regime to learn a phrasal grammar equivalent to linear transduction grammars. The stochastic version of this new grammar type also has the property that the set of biterminals constitutes a natural p...
This paper discusses the effect of hubness in zero-shot learning when ridge regression is used to find a mapping between the example space and the label space. Contrary to the existing approach, which attempts to find a mapping from the example space to the label space, we show that mapping labels into the example space is desirable to suppress the emergence of hubs in the subsequent nearest ne...