Ambiguity and Unknown Term Translation in CLIR
Abstract
This paper reports on our participation in the CLEF Chinese-English ad hoc bilingual track and discusses a disambiguation strategy that employs a modified co-occurrence model to determine the most appropriate translation for a given query. This strategy is used alongside a pattern-based translation extraction method that addresses the ‘unknown term’ translation problem. Experimental results demonstrate that combining the two techniques substantially improves retrieval effectiveness compared with baseline systems that employ basic co-occurrence measures or make no provision for out-of-vocabulary terms.
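To make the idea of co-occurrence-based disambiguation concrete, here is a minimal Python sketch of the basic variant, the kind of baseline the abstract mentions, and not the authors' modified model, whose details are not given here. The toy corpus, the placeholder query terms (`term_1`, `term_2`), the candidate lists, and all function names are assumptions made for illustration. Each query term picks the candidate translation that co-occurs most often, at document level, with the candidate translations of the other query terms.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for each unordered word pair, how many documents contain both."""
    counts = Counter()
    for doc in documents:
        words = sorted(set(doc.lower().split()))
        # sorted + combinations yields each pair once, in alphabetical order
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    return counts


def pair_score(counts, a, b):
    """Look up the document-level co-occurrence count of two words."""
    key = (a, b) if a < b else (b, a)
    return counts[key]


def disambiguate(candidates, counts):
    """For each query term, choose the candidate translation with the highest
    total co-occurrence with the candidates of all other query terms."""
    chosen = {}
    for term, options in candidates.items():
        others = [c for t, opts in candidates.items() if t != term for c in opts]
        chosen[term] = max(
            options,
            key=lambda o: sum(pair_score(counts, o, c) for c in others),
        )
    return chosen


# Hypothetical target-language corpus (illustrative only).
documents = [
    "the bank raised its interest rate",
    "interest paid by the bank fell",
    "the boat reached the shore",
    "painting is a popular hobby",
]

# Hypothetical dictionary lookups: each source term has two candidates.
candidates = {
    "term_1": ["bank", "shore"],
    "term_2": ["interest", "hobby"],
}

result = disambiguate(candidates, cooccurrence_counts(documents))
print(result)
```

On this toy data the sketch selects "bank" and "interest", because that pair appears together in two documents while the other combinations never co-occur. A practical system would typically replace raw counts with a normalized association measure and would need a separate mechanism, such as the pattern-based extraction described above, for terms missing from the dictionary.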
Related Resources
On Bidirectional English-Arabic Search
In Cross-Language Information Retrieval (CLIR), queries in one language retrieve relevant documents in other languages. Machine-Readable Dictionaries (MRD) and Machine Translation (MT) systems are important resources for query translation in CLIR. We investigate the use of MT systems and MRD for Arabic-English and English-Arabic CLIR. The translation ambiguity associated with these resources is ...
A Probabilistic Translation Method for Dictionary-based Cross-lingual Information Retrieval in Agglutinative Languages
Translation ambiguity, out-of-vocabulary words, and missing translations in bilingual dictionaries make dictionary-based Cross-language Information Retrieval (CLIR) a challenging task. Moreover, in agglutinative languages that lack reliable stemmers, the absence of various lexical formations from bilingual dictionaries degrades CLIR performance. This paper aims to introduce a probabilistic tr...
Search-Result-Based Method for Unknown Term Translation in Cross-Language Information Retrieval
In this paper, we adopt two methods targeting NTCIR-5 Chinese-English CLIR task. First, to alleviate problems of unknown query terms, we combine dictionary-based and search-result-based methods to handle query translation for CLIR. Second, to reduce document retrieval time, we use a Chinese part-of-speech (POS) tagger to extract only nouns, verbs, and foreign words as
Simple Query Translation Methods for Korean-English and Korean-Chinese CLIR in NTCIR Experiments
The main goal of our participation in the NTCIR workshop is to evaluate relatively simple yet [...]al methods for CLIR using Korean queries and English and Chinese documents. We employed dictionary-based query translation methods for [...] cases but with different translation ambiguity resolution techniques. The Korean-English CLIR was quite successful, but the Korean-Chinese CLIR resulted in unexpectedly [...] performance. While our analysis is still in progress, w...
Translation Events in Cross-language Information Retrieval: Lexical Ambiguity, Lexical Holes, Vocabulary Mismatch, and Correct Translations
Cross-Language Information Retrieval (CLIR) systems enable users to formulate queries in their native language to retrieve documents in foreign languages. Because queries and documents in CLIR do not necessarily share the same language, translation is needed before matching can take place. This translation step tends to cause a reduction in the retrieval performance of CLIR as compared to monol...
Publication date: 2007