New algorithm for computing cube on very large compressed data sets

Cited by: 5
Authors
IEEE [1]
Unknown [2]
Unknown [3]
Affiliations
[1] Department of Computer Science and Engineering, University of Texas at Dallas, Mail Station EC31, Richardson
[2] Harbin Institute of Technology, Mail Box 750
Source
IEEE Trans Knowl Data Eng | 2006 / Vol. 18 / No. 12 / pp. 1667-1680
Funding
National Science Foundation (USA); National Natural Science Foundation of China;
Keywords
Cube operation; Data compression; Data warehouses; OLAP;
DOI
10.1109/TKDE.2006.195
Abstract
Data compression is an effective technique to improve the performance of data warehouses. Since the cube operation represents the core of online analytical processing in data warehouses, it is a major challenge to develop efficient algorithms for computing the cube on compressed data warehouses. To our knowledge, very few cube computation techniques have been proposed for compressed data warehouses to date in the literature. This paper presents a novel algorithm to compute cubes on compressed data warehouses. The algorithm operates directly on compressed data sets without first decompressing them. The algorithm is applicable to a large class of mapping-complete data compression methods. The complexity of the algorithm is analyzed in detail. The analytical and experimental results show that the algorithm is more efficient than all other existing cube algorithms. In addition, a heuristic algorithm to generate an optimal plan for computing the cube is also proposed. © 2006 IEEE.
Pages: 1667-1680
Page count: 14