Exploiting locality for scalable information retrieval in peer-to-peer networks

Authors

  • Demetrios Zeinalipour-Yazti
  • Vana Kalogeraki
  • Dimitrios Gunopulos
Abstract

An important problem in unstructured peer-to-peer (P2P) networks is the efficient content-based retrieval of documents shared by other peers. However, existing search mechanisms do not scale well because they either flood the network with queries or require some form of global knowledge. We propose the Intelligent Search Mechanism (ISM), an efficient, scalable yet simple mechanism for improving information retrieval in P2P systems. Our mechanism is efficient since it is bounded by the number of neighbors, and scalable because no global knowledge needs to be maintained. ISM consists of four components: a Profiling Structure, which logs queryhit messages coming from neighbors; a Query Similarity function, which calculates the similarity of past queries to a new query; RelevanceRank, an online neighbor ranking function; and a Search Mechanism, which forwards queries to selected neighbors. We deploy and compare ISM with a number of other distributed search techniques over static and dynamic environments. Our experiments are performed with real data over Peerware, our middleware simulation infrastructure, which is deployed on 75 workstations. Our results indicate that ISM outperforms its competitors and that in some cases it achieves a 100% recall rate while using only half of the network resources required by its competitors. Further, its performance is also superior with respect to total query response time, and our algorithm exhibits learning behavior as nodes acquire more knowledge. Finally, ISM works well in dynamic network topologies and in environments with replicated data sources.
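
The four components named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration: queries are treated as keyword sets compared by cosine similarity, a neighbor's rank is the sum over its logged queryhits of similarity raised to a hypothetical exponent `alpha` times the number of results returned, and the names `PeerProfile`, `log_queryhit`, `relevance_rank` and `select_neighbors` are invented for this example rather than taken from the paper.

```python
import math
from collections import defaultdict


def query_similarity(q1: str, q2: str) -> float:
    """Cosine similarity of two queries viewed as binary keyword sets (assumed measure)."""
    a, b = set(q1.lower().split()), set(q2.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


class PeerProfile:
    """Hypothetical per-peer state: a log of queryhits seen from each neighbor."""

    def __init__(self, alpha: float = 2.0):
        # alpha is an assumed tuning knob: larger values favor very similar past queries.
        self.alpha = alpha
        self.hits = defaultdict(list)  # neighbor -> [(past_query, num_results)]

    def log_queryhit(self, neighbor: str, query: str, num_results: int) -> None:
        """Profiling structure: record that `neighbor` answered `query`."""
        self.hits[neighbor].append((query, num_results))

    def relevance_rank(self, neighbor: str, query: str) -> float:
        """Assumed ranking: sum of similarity**alpha weighted by results returned."""
        return sum(
            query_similarity(past, query) ** self.alpha * n
            for past, n in self.hits.get(neighbor, [])
        )

    def select_neighbors(self, query: str, neighbors: list, k: int) -> list:
        """Search mechanism: forward only to the k best-ranked neighbors."""
        ranked = sorted(
            neighbors, key=lambda p: self.relevance_rank(p, query), reverse=True
        )
        return ranked[:k]


profile = PeerProfile()
profile.log_queryhit("A", "jazz music mp3", 5)
profile.log_queryhit("B", "linux kernel source", 8)
profile.log_queryhit("C", "jazz concert video", 1)
print(profile.select_neighbors("jazz music", ["A", "B", "C"], k=2))  # ['A', 'C']
```

Because each peer ranks only its own neighbors from locally logged replies, the per-query work in this sketch grows with the number of neighbors and needs no global index, which matches the efficiency and scalability argument made in the abstract.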

Similar resources

P2P Network Trust Management Survey

Peer-to-peer applications (P2P) are no longer limited to home users, and start being accepted in academic and corporate environments. While file sharing and instant messaging applications are the most traditional examples, they are no longer the only ones benefiting from the potential advantages of P2P networks. For example, network file storage, data transmission, distributed computing, and co...

A Scalable Method for Taking Detailed and Accurate Geo* Snapshots of Large P2P Networks

Peer-to-peer file-sharing systems offer high transfer speed at remarkably low data ownership cost. In many cases, the transfer speed can be increased if the users possess timely and accurate information over the location of other users. Furthermore, robustness and fault tolerance, issues of intense research investigation, are inextricably linked with the location and activity of the network use...

A Scalable Semantic Indexing Framework for Peer-to-Peer Information Retrieval

The exponential growth of data demands scalable and adaptable infrastructures for indexing and searching a huge amount of data sources with high accuracy and efficiency. Existing centralized search engines are not scalable and suffer from single-point-of-failures. The recent work on P2P index construction partitions the document vectors either randomly or statically, making it difficult to trade...

Scalable self-organizing structured P2P information retrieval model based on equivalence classes

This paper proposes a new autonomous self-organizing content-based node clustering peer to peer Information Retrieval (P2PIR) model. This model uses incremental transitive document-to-document similarity technique to build Local Equivalence Classes (LECes) of documents on a source node. Locality Sensitive Hashing (LSH) scheme is applied to map a representative of each LEC into a set of keys whi...

Content-based retrieval of music in scalable peer-to-peer networks

A large portion of data exchanged in today’s Peer-to-Peer (P2P) networks consists of music stored as MP3 compressed audio. Existing P2P systems typically are not scalable and only support primitive methods for the searching of music files, e.g., by looking up exact filenames or using simple metadata information such as artist or album name. In this paper, we present the design and evaluation of...


Journal:
  • Inf. Syst.

Volume 30

Publication date: 2005