Performance of Multi-Level and Multi-Component Compressed Bitmap Indexes
Authors
Abstract
Bitmap indexes are known as the most effective indexing methods for range queries on append-only data, especially for low cardinality attributes. Recently, bitmap indexes were also shown to be just as effective for high cardinality attributes when certain compression methods are applied. There are many different bitmap indexes in the literature, but no definitive comparison among them has been made, largely because there is no accurate prediction of their index sizes and search times. This paper presents a systematic evaluation of two large subsets of compressed bitmap indexes that use multi-component and multi-level encodings. We combine extensive analyses with ample experimental results that confirm them, whereas earlier studies of these indexes were either purely empirical or limited to uncompressed indexes. Our analyses provide highly accurate predictions that agree with test measurements. These analyses not only identify the best methods in terms of index size and query processing cost, but also reveal new ways of using multi-level methods that significantly improve their performance. Using the best parameters obtained through analyses, we produce three two-level indexes with the optimal computational complexity. Furthermore, the fastest two-level indexes are predicted and observed to be 5 to 10 times faster than other well-known indexes.
Similar resources
Massive-Scale RDF Processing Using Compressed Bitmap Indexes
The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scientific data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multi-graph. SPARQL is a query language for RDF that expresses subgraph pattern-finding queries on this implicit multigraph in a SQL-like syntax. SPARQL quer...
Better bitmap performance with Roaring bitmaps
Bitmap indexes are commonly used in databases and search engines. By exploiting bit-level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus we might prefer compressed bitmap indexes. Following Oracle’s lead, bitmaps are often compressed using run-length encoding (RLE). Building on prior work, we introduce the Roaring compressed bitmap format: it...
The Dimension-Join: A New Index for Data Warehouses
There are several auxiliary pre-computed access structures that allow faster answers by reading less base data. Examples are materialized views, join indexes, B-tree and bitmap indexes. This paper proposes dimension-join, a new type of index especially suited for data warehouses. The dimension-join borrows ideas from several concepts. It is a bitmap index, it is a multi-table join and when bein...
Fast Set Intersection through Run-Time Bitmap Construction over PForDelta-Compressed Indexes
Set intersection is a fundamental operation for evaluating conjunctive queries in the context of scientific data analysis. The state-of-the-art approach in performing set intersection, compressed bitmap indexing, achieves high computational efficiency because of cheap bitwise operations; however, overall efficiency is often nullified by the HPC I/O bottleneck, because compressed bitmap indexes ...
SBH: Super byte-aligned hybrid bitmap compression
Bitmap indexes are commonly used in data warehousing applications such as on-line analytic processing (OLAP). Storing the bitmaps in compressed form has been shown to be effective not only for low cardinality attributes, as conventional wisdom would suggest, but also for high cardinality attributes. Compressed bitmap indexes, such as the Byte-aligned Bitmap Code (BBC), have been shown to be efficient in terms of both time...
Journal title:
Volume and issue:
Pages: -
Publication date: 2007