Exploiting Hyperlinks to Learn a Retrieval Model

Authors

  • David Grangier
  • Samy Bengio
Abstract

Information Retrieval (IR) aims at solving a ranking problem: given a query q and a corpus C, the documents of C should be ranked such that the documents relevant to q appear above the others. This task is generally performed by ranking the documents d ∈ C according to their similarity to q, sim(q, d). An effective function (a, b) → sim(a, b) could be identified using a large set of queries with their corresponding relevance assessments. However, such data are especially expensive to label. As an alternative, we propose to rely on hyperlink data, which convey analogous semantic relationships. We then empirically show that a measure sim inferred from hyperlinked documents can outperform the state-of-the-art Okapi approach when applied to a non-hyperlinked retrieval corpus.
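
The ranking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' learned measure, and it is not Okapi: the similarity used here is a plain term-frequency cosine, chosen only as a stand-in. The names `cosine_sim` and `rank` and the toy corpus are hypothetical and exist only for this illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term counts (whitespace tokenization, lowercased)."""
    return Counter(text.lower().split())


def cosine_sim(a, b):
    """Stand-in for sim(a, b): cosine between raw term-frequency vectors.

    Assumption: this is NOT the measure learned in the paper, only a
    placeholder with the same (a, b) -> score shape.
    """
    va, vb = tf_vector(a), tf_vector(b)
    dot = sum(va[t] * vb[t] for t in va if t in vb)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, corpus, sim=cosine_sim):
    """Order the documents of the corpus by decreasing sim(query, d)."""
    return sorted(corpus, key=lambda d: sim(query, d), reverse=True)


# Toy corpus, invented for this example.
corpus = [
    "hyperlinks connect related web pages",
    "cooking pasta with tomato sauce",
    "learning a retrieval model from hyperlinks",
]
ranked = rank("retrieval model hyperlinks", corpus)
for doc in ranked:
    print(doc)
```

Because `rank` takes `sim` as a parameter, any other similarity function with the same two-argument shape could be substituted without changing the ranking code.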


Similar resources

Exploiting hyperlinks for automatic information discovery on the WWW

The explosion of the World Wide Web as a global information network brings with it a number of related challenges for information retrieval and automation. The link structure, which is the main feature of the hypermedia environment, can be a rich source of information for exploration. This paper is centered around the exploiting of hyperlinks in the subject of automatic discovery. In this paper...


Enhancing retrieval with hyperlinks: A general model based on propositional argumentation systems

Fast, effective, and adaptable techniques are needed to automatically organize and retrieve information on the ever-increasing World Wide Web. In that respect, different strategies have been suggested to take hypertext links into account. For example, hyperlinks have been used to (1) enhance document representation, (2) improve document ranking by propagating document score, (3) provide an indi...


Exploiting Hyperlinks for Automatic Information Discovery on the WWW - Tools with Artificial Intelligence, 1998. Proceedings. Tenth IEEE International Conference on

The explosion of the World Wide Web as a global information network brings with it a number of related challenges for information retrieval and automation. The link structure, which is the main feature of the hypermedia environment, can be a rich source of information for exploration. This paper is centered around the exploiting of hyperlinks in the subject of automatic discovery. In this paper...


Impact of placing icons next to hyperlinks on information-retrieval tasks on the web

Though several studies have demonstrated the usefulness of pictures in multimedia learning, memory, cognitive load and visual search, there have been very few attempts to study their impact in the web-navigation scenario. Also, cognitive models of web-navigation (like CoLiDeS, CoLiDeS+) ignore the information from visual modality and focus solely on the information from text. We conducted an ex...


Interactive retrieval of nature images using multiple-instance learning

Content-based image retrieval (CBIR) has received considerable research interest in recent years. The basic problem in CBIR is the semantic gap between the high-level image semantics and the low-level image features. Region-based image retrieval and learning from user interaction through relevance feedback are two main approaches to solving this problem. Recently, the research in integra...




Publication date: 2005