Condensed Cube: An Effective Approach to Reducing Data Cube Size

Authors

  • Wei Wang
  • Jianlin Feng
  • Hongjun Lu
  • Jeffrey Xu Yu
Abstract

A pre-computed data cube facilitates OLAP (On-Line Analytical Processing). Data cube computation is well known to be an expensive operation, and it has attracted considerable attention. While most proposed algorithms focus on optimizing memory management and reducing computation costs, less work addresses one of the fundamental issues: the size of a data cube is huge when a large base relation with a large number of attributes is involved. In this paper, we propose a new concept, called a condensed data cube. The condensed cube is much smaller than a complete non-condensed cube. More importantly, it is a fully pre-computed cube without compression and, hence, requires neither decompression nor further aggregation when answering queries. Several algorithms for computing a condensed cube are proposed. Results of experiments on the effectiveness of the condensed data cube are presented, using both synthetic and real-world data. The results indicate that the proposed condensed cube can reduce both the cube size and, consequently, its computation time.

Similar resources

Extension of Cube Attack with Probabilistic Equations and its Application on Cryptanalysis of KATAN Cipher

Cube Attack is a successful case of Algebraic Attack. Cube Attack consists of two phases, linear equation extraction and solving the extracted equation system. Due to the high complexity of equation extraction phase in finding linear equations, we can extract nonlinear ones that could be approximated to linear equations with high probability. The probabilistic equations could be considered as l...

Multidimensional cyclic graph approach: Representing a data cube without common sub-graphs

We present a new full cube computation technique and a cube storage representation approach, called the multidimensional cyclic graph (MCG) approach. The data cube relational operator has exponential complexity and therefore its materialization involves both a huge amount of memory and a substantial amount of time. Reducing the size of data cubes, without a loss of generality, thus becomes a fu...

Emerging Cubes: Borders, size estimations and lossless reductions

Discovering trend reversals between two data cubes provides users with a novel and interesting knowledge when the real world context fluctuates: What is new? Which trends appear or emerge? Which tendencies are immersing or disappear? With the concept of Emerging Cube, we capture such trend reversals by enforcing an emergence constraint. We resume the classical borders for the Emerging Cube and ...

Using Functional Dependencies for Reducing the Size of a Data Cube

Functional dependencies (FD’s) are a powerful concept in data organization. They have been proven very useful in e.g., relational databases for reducing data redundancy. Little work however has been done so far for using them in the context of data cubes. In the present paper, we propose to characterize the parts of a data cube to be materialized with the help of the FD’s present in the underly...

The Dwarf Data Cube Eliminates the High Dimensionality Curse

The data cube operator encapsulates all possible groupings of a data set and has proved to be an invaluable tool in analyzing vast amounts of data. However its apparent exponential complexity has significantly limited its applicability to low dimensional datasets. Recently the idea of the dwarf data cube model was introduced, and showed that high-dimensional “dwarf data cubes” are orders of magn...

Publication date: 2002