Multiscale Quantization for Fast Similarity Search

Authors

  • Xiang Wu
  • Ruiqi Guo
  • Ananda Theertha Suresh
  • Sanjiv Kumar
  • Daniel N. Holtmann-Rice
  • David Simcha
  • Felix X. Yu
Abstract

We propose a multiscale quantization approach for fast similarity search on large, high-dimensional datasets. The key insight of the approach is that quantization methods, in particular product quantization, perform poorly when there is large variance in the norms of the data points. This is a common scenario for real-world datasets, especially when doing product quantization of residuals obtained from coarse vector quantization. To address this issue, we propose a multiscale formulation where we learn a separate scalar quantizer of the residual norm scales. All parameters are learned jointly in a stochastic gradient descent framework to minimize the overall quantization error. We provide theoretical motivation for the proposed technique and conduct comprehensive experiments on two large-scale public datasets, demonstrating substantial improvements in recall over existing state-of-the-art methods.
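The pipeline outlined in the abstract (coarse vector quantization, product quantization of the residuals, and a separate scalar quantizer for the residual norm) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `kmeans` and `multiscale_reconstruct`, all parameter names, and the default sizes are hypothetical, and each codebook is fitted independently with plain k-means, whereas the abstract says all parameters are learned jointly with stochastic gradient descent.

```python
import numpy as np


def kmeans(x, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means on the rows of x; returns (centroids, assignments)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.argmin(axis=1)


def multiscale_reconstruct(x, n_coarse=4, n_sub=2, n_codes=8, n_scales=4):
    """Quantize x and return its reconstruction (hypothetical helper).

    Assumption: codebooks are fitted one after another with k-means,
    not jointly with SGD as described in the abstract.
    """
    # 1. Coarse vector quantization, then residuals to the assigned centroid.
    coarse, coarse_id = kmeans(x, n_coarse)
    residual = x - coarse[coarse_id]

    # 2. Separate each residual into a norm (scale) and a unit direction.
    norms = np.linalg.norm(residual, axis=1, keepdims=True)
    unit = residual / np.maximum(norms, 1e-12)

    # 3. Product quantization of the unit-norm residuals:
    #    one independent codebook per subspace.
    parts = []
    for i, sub in enumerate(np.split(unit, n_sub, axis=1)):
        codebook, code = kmeans(sub, n_codes, seed=i + 1)
        parts.append(codebook[code])
    unit_hat = np.concatenate(parts, axis=1)

    # 4. Scalar quantizer for the residual norms (1-D k-means).
    levels, scale_id = kmeans(norms, n_scales, seed=99)
    scale_hat = levels[scale_id]  # shape (n, 1), broadcasts over dimensions

    return coarse[coarse_id] + scale_hat * unit_hat
```

Because the product quantizer only sees unit-norm residuals, its codebooks are not stretched to cover points with very different norms; the scale is restored afterwards from a small set of quantized levels.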

Similar resources

Fast image search on a VQ compressed image database

A fast and efficient image search method is developed for a compressed image database using vector quantization (VQ). An image search on an image database requires an exhaustive sequential scan of all the images, given the similarity measure. If compressed images are dealt with, images are decompressed as an initial operation and then the previously mentioned exhaustive search is performed usin...

Hamming Compatible Quantization for Hashing

Hashing is one of the effective techniques for fast Approximate Nearest Neighbour (ANN) search. Traditional single-bit quantization (SBQ) in most hashing methods incurs lots of quantization error which seriously degrades the search performance. To address the limitation of SBQ, researchers have proposed promising multi-bit quantization (MBQ) methods to quantize each projection dimension with mu...

Image representation and processing through multiscale local jet features

We propose a unified framework for representing and processing images using a feature space related to local similarity. We choose the multiscale and versatile local jet feature space to represent the visual data. This feature space may be reduced by vector quantisation and/or be represented by data structures enabling efficient nearest neighbours search (e.g. kd-trees). We show the interest of...

Multiscale modeling and estimation of motion fields for video coding

We present a systematic approach to forward-motion-compensated predictive video coding. The first step is the definition of a flexible model that compactly represents motion fields. The inhomogeneity and spatial coherence properties of motion fields are captured using linear multiscale models. One possible design is based on linear finite elements and yields a multiscale extension of the triang...

Angular Quantization-based Binary Codes for Fast Similarity Search

This paper focuses on the problem of learning binary codes for efficient retrieval of high-dimensional non-negative data that arises in vision and text applications where counts or frequencies are used as features. The similarity of such feature vectors is commonly measured using the cosine of the angle between them. In this work, we introduce a novel angular quantization-based binary coding (A...

Publication date: 2017