Approximate Query Answering Using Data Warehouse Striping

Cited by: 0
Authors
Jorge R. Bernardino
Pedro S. Furtado
Henrique C. Madeira
Affiliations
[1] Polytechnic of Coimbra, DEI, Pólo II
[2] ISEC
[3] DEIS
[4] University of Coimbra
Source
Journal of Intelligent Information Systems | 2002 / Vol. 19
Keywords
data warehousing; distributed query optimization; data partitioning; performance optimization; approximate query answering
DOI
Not available
Abstract
This paper presents and evaluates a simple but very effective method for implementing large data warehouses on an arbitrary number of computers, achieving very high query execution performance and scalability. The data is distributed across, and processed by, a potentially large number of autonomous computers using our technique, called data warehouse striping (DWS). The major problem with the DWS technique is that it would require a very expensive cluster of computers with fault-tolerant capabilities to prevent a fault in a single computer from stopping the whole system. In this paper, we propose a radically different approach to dealing with the unavailability of one or more computers in the cluster, allowing the use of DWS with a very large number of inexpensive computers. The proposed approach is based on approximate query answering techniques that make it possible to deliver an approximate answer to the user even when one or more computers in the cluster are unavailable. The evaluation presented in the paper shows, both analytically and experimentally, that the approximate results obtained this way have a very small error that can be negligible in most cases.
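The idea in the abstract can be illustrated with a small Python sketch. It is not the paper's implementation: the round-robin placement, the scaling estimator (partial sum multiplied by total nodes over available nodes), and all function names (`stripe`, `approximate_sum`, `demo`) are assumptions made here for illustration only.

```python
import random


def stripe(rows, n_nodes):
    """Distribute fact rows round-robin across n_nodes partitions (assumed placement)."""
    nodes = [[] for _ in range(n_nodes)]
    for i, row in enumerate(rows):
        nodes[i % n_nodes].append(row)
    return nodes


def approximate_sum(nodes, unavailable=frozenset()):
    """Estimate SUM over all rows using only the available nodes.

    Assumption: every node holds a roughly uniform slice of the facts, so the
    partial sum from the available nodes is scaled by n_total / n_available.
    """
    available = [part for i, part in enumerate(nodes) if i not in unavailable]
    if not available:
        raise ValueError("no nodes available")
    partial = sum(sum(part) for part in available)
    return partial * len(nodes) / len(available)


def demo(n_rows=100_000, n_nodes=20, failed=(3,), seed=0):
    """Compare the exact sum with the estimate when some nodes are down."""
    rng = random.Random(seed)
    rows = [rng.uniform(0, 100) for _ in range(n_rows)]
    nodes = stripe(rows, n_nodes)
    exact = approximate_sum(nodes)  # all nodes available
    estimate = approximate_sum(nodes, unavailable=frozenset(failed))
    rel_error = abs(estimate - exact) / exact
    return exact, estimate, rel_error
```

Calling `demo()` returns the exact sum, the estimate with one of twenty nodes missing, and the relative error between them. Under the uniform-placement assumption the error shrinks as each node holds more rows, which is consistent in spirit with the abstract's claim of small errors, though the paper's own estimator and error analysis may differ.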
Pages: 145-167
Page count: 22