Incorporating Metric Access Methods for Similarity Searching on Oracle Database
Authors
Abstract
The volume of multimedia and complex data (images, videos, audio, time series, DNA sequences, and others) has been growing at a very fast pace. Databases must therefore store many types of data that Database Management Systems (DBMSs) do not handle natively. Complex data are well suited to being queried by similarity. Many works have addressed techniques for similarity searching, but most of them were not conceived for integration into a database engine. Including similarity search in the database core, however, would allow queries that combine complex and conventional data to take advantage of the DBMS resources. Oracle Corp. developed the Oracle interMedia module to support multimedia data in its database manager, providing several operations to manipulate them. It supports content-based image retrieval through proprietary functions that extract intrinsic features from images and compute their similarity. In this paper we describe another module for similarity search, also developed using Oracle’s Extensible Architecture Framework. Our approach allows user-defined feature extraction methods and distance functions to be included in the database core, providing greater flexibility. We also present experiments showing that querying images by content with our module improves on the results obtained using Oracle alone, both in the precision of the results and in query execution performance.
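To make the idea of pluggable distance functions concrete, the following is a minimal sketch of the two classic similarity queries (range and k-nearest-neighbor) over feature vectors. It is not the module described in the paper and does not use any Oracle API: the names `range_query`, `knn_query`, `euclidean`, `manhattan`, and the toy `features` table are all hypothetical, and the sketch scans every item linearly, whereas a metric access method would use the properties of the distance function to prune comparisons.

```python
from heapq import nsmallest
from math import sqrt
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def euclidean(a: Vector, b: Vector) -> float:
    """L2 distance between two feature vectors of equal length."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Vector, b: Vector) -> float:
    """L1 distance between two feature vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def range_query(dataset: Dict[str, Vector], query: Vector,
                radius: float, distance: Distance) -> List[str]:
    """Return the keys of all items within `radius` of `query` (linear scan)."""
    return [key for key, feat in dataset.items()
            if distance(feat, query) <= radius]


def knn_query(dataset: Dict[str, Vector], query: Vector,
              k: int, distance: Distance) -> List[str]:
    """Return the keys of the `k` items closest to `query`, nearest first."""
    nearest = nsmallest(k, dataset.items(),
                        key=lambda item: distance(item[1], query))
    return [key for key, _ in nearest]


# Hypothetical feature table: image identifier -> extracted feature vector.
features = {
    "img_a": (0.0, 0.0),
    "img_b": (1.0, 1.0),
    "img_c": (3.0, 4.0),
}

if __name__ == "__main__":
    # The same query code runs with whichever distance function is supplied.
    print(range_query(features, (0.0, 0.0), 2.0, euclidean))
    print(knn_query(features, (3.0, 3.0), 2, manhattan))
```

Passing the distance function as a parameter mirrors, in a very simplified way, the flexibility the abstract attributes to user-defined distance functions: the query logic stays the same while the notion of similarity changes.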
Similar resources
Universal Indexing of Arbitrary Similarity Models
The increasing amount of available unstructured content together with the growing number of large non-relational databases put more emphasis on the content-based retrieval and precisely on the area of similarity searching. Although there exist several indexing methods for efficient querying, not all of them are best-suited for arbitrary similarity models. Having a metric space, we can easily ap...
OrChem: an open source chemistry search engine for Oracle
BACKGROUND: Registration, indexing and searching of chemical structures in relational databases is one of the core areas of cheminformatics. However, little detail has been published on the inner workings of search engines and their development has been mostly closed-source. We decided to develop an open source chemistry extension for Oracle, the de facto database platform in the commercial worl...
On M-tree Variants in Metric and Non-metric Spaces
Although many metric access methods (MAMs) have been developed so far to solve the problem of similarity searching, there is still a great need to improve retrieval efficiency. One of the most widely accepted MAMs is the M-tree, which meets the essential requirements of large, persistent and dynamic databases. The M-tree’s retrieval inefficiency is hidden in overlaps of its regions, therefore, its ...
An Index Data Structure for Searching in Metric Space Databases
This paper presents the Evolutionary Geometric Near-neighbor Access Tree (EGNAT) which is a new data structure devised for searching in metric space databases. The EGNAT is fully dynamic, i.e., it allows combinations of insert and delete operations, and has been optimized for secondary memory. Empirical results on different databases show that this tree achieves good performance for high-dimens...
Metric Indexing of Protein Databases and Promising Approaches
The most widely used biological databases today are nucleotide and protein databases. These databases are crucial for determining the biological functions of living organisms with respect to their DNA structure. The biological function of a protein can be derived from its similarity to another protein with a known function that is stored in a database, and therefore the chance of finding the biologi...
Journal title:
Volume, issue:
Pages: -
Publication date: 2009