Approximate similarity search in metric spaces using inverted files
نویسندگان
چکیده
We propose a new approach to perform approximate similarity search in metric spaces. The idea at the basis of this technique is that when two objects are very close one to each other they ’see’ the world around them in the same way. Accordingly, we can use a measure of dissimilarity between the view of the world, from the perspective of the two objects, in place of the distance function of the underlying metric space. To exploit this idea we represent each object of a dataset by the ordering of a number of reference objects of the metric space according to their distance from the object itself. In order to compare two objects of the dataset we compare the two corresponding orderings of the reference objects. We show that efficient and effective approximate similarity searching can be obtained by using inverted files, relying on this idea. We show that the proposed approach performs better than other approaches in literature.
منابع مشابه
Approximate Similarity Search from another ” Perspective ” ( EXTENDED ABSTRACT ) ?
We propose a new approach to perform approximate similarity search in metric spaces [8]. The idea at the basis of this technique is that when two objects are very close one to each other they ’see’ the world around them in the same way. Accordingly, we can use a measure of dissimilarity between the view of the world, from the perspective of the two objects, in place of the distance function of ...
متن کاملNon-Metric Space Library Manual
This document describes a library for similarity searching. Even though the library contains a variety of metric-space access methods, our main focus is on search methods for non-metric spaces. Because there are fewer exact solutions for non-metric spaces, many of our methods give only approximate answers. Thus, the methods are evaluated in terms of efficiency-effectiveness trade-offs rather th...
متن کاملUsing the Distance Distribution for Approximate Similarity Queries in High-Dimensional Metric Spaces
We investigate the problem of approximate similarity (nearest neighbor) search in high-dimensional metric spaces, and describe how the distance distribution of the query object can be exploited so as to provide probabilistic guarantees on the quality of the result. This leads to a new paradigm for similarity search, called PAC-NN (probably approximately correct nearest neighbor) queries, aiming...
متن کاملAccess Structures for Advanced Similarity Search in Metric Spaces
Similarity retrieval is an important paradigm for searching in environments where exact match has little meaning. Moreover, in order to enlarge the set of data types for which the similarity search can efficiently be performed, the notion of mathematical metric space provides a useful abstraction for similarity. In this paper we consider the problem of organizing and searching large data-sets f...
متن کاملSimilarity Measures for Relational Databases
We enrich sets with an integrated notion of similarity, measured in a (complete) lattice, special cases of which are reflexive sets and bounded metric spaces. Relations and basic relational operations of traditional relational algebra are interpreted in such richer structured environments. An canonical similarity measure between relations is introduced. In the special case of reflexive sets it ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008