Integration of PLSA into Probabilistic CLIR Model - Yokohama National University at NTCIR4 CLIR

Authors

Tetsu Muramatsu

Tatsunori Mori

Abstract

In this paper, we propose a method of cross-language information retrieval (CLIR) based on an integration of a probabilistic CLIR model and Probabilistic Latent Semantic Analysis (PLSA). PLSA is adopted to extract translation probability information from a parallel corpus, and this information is used in the probabilistic CLIR model. Although the probabilistic CLIR model with PLSA is quite effective, it takes a very long time to process. We therefore introduce an approximation method based on a two-phased retrieval model in order to reduce the computational cost. Using this model, we submitted runs for Japanese-to-English bilingual retrieval in the CLIR task of NTCIR4.
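The abstract does not spell out how the two-phased retrieval works, so the following is only a minimal sketch of the general idea under stated assumptions: a cheap scorer ranks the whole collection, and a costly scorer (standing in for the PLSA-based model) re-ranks only the top k candidates. The names `two_phase_retrieve`, `cheap_score`, and `costly_score`, along with the toy scoring functions and data, are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, List, Tuple

# A scorer takes (query terms, document terms) and returns a relevance score.
Scorer = Callable[[List[str], List[str]], float]


def two_phase_retrieve(
    query: List[str],
    docs: Dict[str, List[str]],
    cheap_score: Scorer,
    costly_score: Scorer,
    k: int,
) -> List[Tuple[str, float]]:
    """Rank all documents cheaply, then re-rank only the top k with the costly scorer."""
    # Phase 1: cheap scoring over the whole collection, keep the k best candidates.
    candidates = sorted(
        docs, key=lambda doc_id: cheap_score(query, docs[doc_id]), reverse=True
    )[:k]
    # Phase 2: the costly scorer runs on the k candidates only.
    rescored = [(doc_id, costly_score(query, docs[doc_id])) for doc_id in candidates]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Toy data (hypothetical): documents and query are already in the same language.
docs = {
    "d1": ["bank", "river"],
    "d2": ["bank", "money"],
    "d3": ["tree"],
}
query = ["bank", "money"]
costly_calls: List[str] = []  # records how often the costly scorer is invoked


def cheap_score(q: List[str], d: List[str]) -> float:
    # Assumption: simple term overlap as the inexpensive first-pass score.
    return float(len(set(q) & set(d)))


def costly_score(q: List[str], d: List[str]) -> float:
    # Assumption: placeholder for an expensive model; length-normalised term matches.
    costly_calls.append("call")
    return sum(1.0 / len(d) for term in q if term in d)


result = two_phase_retrieve(query, docs, cheap_score, costly_score, k=2)
```

With `k=2`, the costly scorer is evaluated on two of the three documents, which illustrates where the saving in computation would come from; how the paper actually chooses the first-phase model and the cutoff is not stated in the abstract.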

Similar resources

University of Chicago at NTCIR4 CLIR: Multi-Scale Query Expansion

Pseudo-relevance feedback, while useful in monolingual applications for refining and enriching short user queries, proves even more important in cross-language information retrieval (CLIR). For CLIR, query expansion before and after translation can provide an opportunity to recover from translation gaps, reduce ambiguity, and enhance recall. Furthermore, for CLIR in unsegmented Asian languages, ...

Ricoh in the NTCIR4 CLIR Tasks

This paper describes Ricoh’s participation in the NTCIR-4 CLIR tasks. We used the same approach as we took at the NTCIR-3 IR tasks for Japanese. We applied our system using a Traditional/Simplified Chinese converter and n-gram indexing for the Chinese IR task. The results show that our simple approach for Chinese IR can provide information retrieval for both Traditional and Simplified Chinese.

NTCIR-6 CLIR Experiments at Osaka Kyoiku University - Term Expansion Using Online Dictionaries and Weighting Score by Term Variety

This paper describes experimental results for the J-J subtask of NTCIR-6 CLIR. We expanded query terms using online dictionaries on the Web. This was effective for some topics whose average precision was low. A probabilistic model was employed for scoring, and we also modified this score by multiplying it by the number of varieties of query terms. In most cases this works well. Query term reduction should b...

Implicit ambiguity resolution using incremental clustering in cross-language information retrieval

This paper presents a method to implicitly resolve ambiguities using dynamic incremental clustering in cross-language information retrieval (CLIR) such as Korean-to-English and Japanese-to-English CLIR. The main objective of this paper is to show that document clusters can effectively resolve the ambiguities, which increase tremendously in translated queries, as well as take into account the context of all...

A Probabilistic Translation Method for Dictionary-based Cross-lingual Information Retrieval in Agglutinative Languages

Translation ambiguity, out-of-vocabulary words, and missing translations in bilingual dictionaries make dictionary-based cross-language information retrieval (CLIR) a challenging task. Moreover, in agglutinative languages that do not have reliable stemmers, the absence of various lexical formations from bilingual dictionaries degrades CLIR performance. This paper aims to introduce a probabilistic tr...

Publication date: 2004
