Similarity join size estimation using locality sensitive hashing

Similar resources

Similarity Join Size Estimation using Locality Sensitive Hashing

Similarity joins are important operations with a broad range of applications. In this paper, we study the problem of vector similarity join size estimation (VSJ). It is a generalization of the previously studied set similarity join size estimation (SSJ) problem and can handle more interesting cases such as TF-IDF vectors. One of the key challenges in similarity join size estimation is that the ...
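As a point of reference for the quantity being estimated, the sketch below computes the exact join size for cosine similarity by brute force and compares it with a naive uniform pair-sampling estimate. It is not the LSH-based estimator studied in the paper; the function names (`cosine`, `exact_join_size`, `sampled_join_size`), the threshold `tau`, and the toy vectors are all hypothetical choices made for illustration.

```python
import itertools
import math
import random


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def exact_join_size(vectors, tau):
    """Count unordered pairs with cosine similarity >= tau (O(n^2) brute force)."""
    return sum(
        1 for u, v in itertools.combinations(vectors, 2) if cosine(u, v) >= tau
    )


def sampled_join_size(vectors, tau, num_samples, seed=0):
    """Estimate the join size by scaling the hit rate of uniformly sampled pairs.

    This is plain random sampling, an assumed baseline, not an LSH scheme.
    """
    rng = random.Random(seed)
    n = len(vectors)
    total_pairs = n * (n - 1) // 2
    hits = 0
    for _ in range(num_samples):
        i, j = rng.sample(range(n), 2)  # two distinct indices
        if cosine(vectors[i], vectors[j]) >= tau:
            hits += 1
    return total_pairs * hits / num_samples


# Toy data: two tight clusters of two vectors each.
vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
exact = exact_join_size(vectors, tau=0.9)
estimate = sampled_join_size(vectors, tau=0.9, num_samples=1000)
```

On this toy input a third of all pairs qualify, so uniform sampling does well. When qualifying pairs are a tiny fraction of all pairs, a uniform sample of this kind needs far more draws to see any hits at all.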

Similarity Search and Locality Sensitive Hashing using TCAMs

Similarity search methods are widely used as kernels in various data mining and machine learning applications, including computational biology and web search/clustering. Nearest neighbor search (NNS) algorithms are often used to retrieve similar entries, given a query. While there exist efficient techniques for exact query lookup using hashing, similarity search using exact nearest neighbo...

Beyond Locality-Sensitive Hashing

We present a new data structure for the c-approximate near neighbor problem (ANN) in the Euclidean space. For $n$ points in $\mathbb{R}^d$, our algorithm achieves $O_c(n^{\rho} + d \log n)$ query time and $O_c(n^{1+\rho} + d \log n)$ space, where $\rho \le 7/(8c^2) + O(1/c^3) + o_c(1)$. This is the first improvement over the result by Andoni and Indyk (FOCS 2006) and the first data structure that bypasses a locality-sensitive hashing lower boun...

Fractal Image Compression Self-Similarity via Locality Sensitive Hashing

In this paper I describe a Haskell implementation of fractal image compression, a lossy image compression technique that leverages self-similarity within an image to produce an encoding. Known for its lengthy encoding time, fractal image encoding implementations require the most cleverness in identifying highly self-similar image regions. In this paper, I describe a simple locality sensitive ha...

Bayesian Locality Sensitive Hashing for Fast Similarity Search

Given a collection of objects and an associated similarity measure, the all-pairs similarity search problem asks us to find all pairs of objects with similarity greater than a certain user-specified threshold. Locality-sensitive hashing (LSH) based methods have become a very popular approach for this problem. However, most such methods only use LSH for the first phase of similarity search i.e. ...

Journal

Journal title: Proceedings of the VLDB Endowment

Year: 2011

ISSN: 2150-8097

DOI: 10.14778/1978665.1978666