Performance of Multi-Level and Multi-Component Compressed Bitmap Indexes

Authors

  • Kesheng Wu
  • Kurt Stockinger
  • Arie Shoshani
Abstract

Bitmap indexes are known as the most effective indexing methods for range queries on append-only data, especially for low cardinality attributes. Recently, bitmap indexes were also shown to be just as effective for high cardinality attributes when certain compression methods are applied. There are many different bitmap indexes in the literature but no definite comparison among them has been made, largely because there is no accurate prediction of their index sizes and search time. This paper presents a systematic evaluation of two large subsets of compressed bitmap indexes that use multi-component and multi-level encodings. We combine extensive analyses with ample experimental results to confirm them, whereas earlier studies of these indexes are either empirical or for uncompressed indexes only. Our analyses provide highly accurate predictions that agree with test measurements. These analyses not only identify the best methods in terms of index size and query processing cost, but also reveal new ways of using multi-level methods that significantly improve their performance. Using the best parameters obtained through analyses, we produce three two-level indexes with the optimal computational complexity. Furthermore, the fastest two-level indexes are predicted and observed to be 5 to 10 times faster than other well-known indexes.

Similar resources

Massive-Scale RDF Processing Using Compressed Bitmap Indexes

The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scientific data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multi-graph. SPARQL is a query language for RDF that expresses subgraph pattern-finding queries on this implicit multigraph in a SQL-like syntax. SPARQL quer...

Better bitmap performance with Roaring bitmaps

Bitmap indexes are commonly used in databases and search engines. By exploiting bit-level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus we might prefer compressed bitmap indexes. Following Oracle’s lead, bitmaps are often compressed using run-length encoding (RLE). Building on prior work, we introduce the Roaring compressed bitmap format: it...

The Dimension-Join: A New Index for Data Warehouses

There are several auxiliary pre-computed access structures that allow faster answers by reading less base data. Examples are materialized views, join indexes, B-tree and bitmap indexes. This paper proposes dimension-join, a new type of index especially suited for data warehouses. The dimension-join borrows ideas from several concepts. It is a bitmap index, it is a multi-table join and when bein...

Fast Set Intersection through Run-Time Bitmap Construction over PForDelta-Compressed Indexes

Set intersection is a fundamental operation for evaluating conjunctive queries in the context of scientific data analysis. The state-of-the-art approach in performing set intersection, compressed bitmap indexing, achieves high computational efficiency because of cheap bitwise operations; however, overall efficiency is often nullified by the HPC I/O bottleneck, because compressed bitmap indexes ...

SBH: Super byte-aligned hybrid bitmap compression

Bitmap indexes are commonly used in data warehousing applications such as on-line analytic processing (OLAP). Storing the bitmaps in compressed form has been shown to be effective not only for low cardinality attributes, as conventional wisdom would suggest, but also for high cardinality attributes. Compressed bitmap indexes, such as Byte-aligned [...], have been shown to be efficient in terms of both time...

Publication date: 2007