Fast Set Intersection in Memory

Authors

  • Bolin Ding
  • Arnd Christian König
Abstract

Set intersection is a fundamental operation in information retrieval and database systems. This paper introduces linear-space data structures to represent sets such that their intersection can be computed in a worst-case efficient way. In general, given k (preprocessed) sets with a total of n elements, we show how to compute their intersection in expected time O(n/√w + kr), where r is the intersection size and w is the number of bits in a machine word. In addition, we introduce a very simple version of this algorithm that has weaker asymptotic guarantees but performs even better in practice; both algorithms outperform state-of-the-art techniques on both synthetic and real data sets and workloads.
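The abstract does not describe the data structure itself, so the following is only a rough illustration of the general idea of using word-sized bitmaps as a cheap filter during intersection. It is not the paper's algorithm and carries none of its guarantees. The names `BucketedSet`, `intersect`, and `_mix`, the 64-bit word width, and the requirement that both sets use the same number of buckets are all assumptions made for this sketch.

```python
W = 64  # assumed machine-word width in bits
MASK = (1 << 64) - 1


def _mix(x: int) -> int:
    # Hypothetical multiplicative hash to 64 bits; any reasonable mixer works.
    return (x * 0x9E3779B97F4A7C15) & MASK


class BucketedSet:
    """Preprocessed set: elements are hashed into buckets, and each bucket
    keeps a W-bit signature word summarising which elements it holds."""

    def __init__(self, elements, num_buckets):
        self.num_buckets = num_buckets
        self.signatures = [0] * num_buckets
        self.buckets = [[] for _ in range(num_buckets)]
        for x in set(elements):
            h = _mix(x)
            b = h % num_buckets      # bucket index
            bit = h >> (64 - 6)      # top 6 bits give a position in [0, 63]
            self.signatures[b] |= 1 << bit
            self.buckets[b].append(x)


def intersect(a: BucketedSet, b: BucketedSet):
    """Intersect two sets that were built with the same num_buckets."""
    assert a.num_buckets == b.num_buckets
    result = []
    for i in range(a.num_buckets):
        # One AND on two words: if no bit is shared, the buckets cannot
        # share an element, so the element-wise comparison is skipped.
        if a.signatures[i] & b.signatures[i] == 0:
            continue
        other = set(b.buckets[i])
        result.extend(x for x in a.buckets[i] if x in other)
    return sorted(result)


a = BucketedSet([1, 5, 9, 12, 40, 77], num_buckets=4)
b = BucketedSet([5, 12, 13, 77, 100], num_buckets=4)
print(intersect(a, b))  # [5, 12, 77]
```

A common element always hashes to the same bucket and the same bit in both sets, so the filter never discards a true match. It can only let through false positives, which the element-wise check then removes.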


Similar resources

Some lower bounds for the $L$-intersection number of graphs

For a set of non-negative integers~$L$, the $L$-intersection number of a graph is the smallest number~$l$ for which there is an assignment of subsets $A_v \subseteq \{1,\dots,l\}$ to vertices $v$, such that every two vertices $u,v$ are adjacent if and only if $|A_u \cap A_v| \in L$. The bipartite $L$-intersection number is defined similarly when the conditions are considered only for the ver...


On the Security of O-PSI a Delegated Private Set Intersection on Outsourced Datasets (Extended Version)

In recent years, determining the common information privately and efficiently between two mutually mistrusting parties has become an important issue in social networks. Many private set intersection (PSI) protocols have been introduced to address this issue. By applying these protocols, two parties can compute the intersection between their sets without disclosing any information about compone...


Fast Set Intersection through Run-Time Bitmap Construction over PForDelta-Compressed Indexes

Set intersection is a fundamental operation for evaluating conjunctive queries in the context of scientific data analysis. The state-of-the-art approach in performing set intersection, compressed bitmap indexing, achieves high computational efficiency because of cheap bitwise operations; however, overall efficiency is often nullified by the HPC I/O bottleneck, because compressed bitmap indexes ...


Fast Set Intersection and Two-Patterns Matching

In this paper, we present a new problem, the fast set intersection problem, which is to preprocess a collection of sets in order to efficiently report the intersection of any two sets in the collection. In addition, we suggest new solutions for the two-dimensional substring indexing problem and the document listing problem for two patterns by reduction to the fast set intersection problem.


SOME RESULTS ON THE COMPLEMENT OF THE INTERSECTION GRAPH OF SUBGROUPS OF A FINITE GROUP

Let G be a group. Recall that the intersection graph of subgroups of G is an undirected graph whose vertex set is the set of all nontrivial subgroups of G and distinct vertices H,K are joined by an edge in this graph if and only if the intersection of H and K is nontrivial. The aim of this article is to investigate the interplay between the group-theoretic properties of a finite group G and the...



Journal:
  • PVLDB

Volume 4

Publication date: 2011