A robust, optimization-based approach for approximate answering of aggregate queries

Cited by: 0
Authors
Chaudhuri, S [1]
Das, G [1]
Narasayya, V [1]
Affiliations
[1] Microsoft Corp, Redmond, WA 98052 USA
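Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract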
The ability to approximately answer aggregation queries accurately and efficiently is of great benefit for decision support and data mining tools. In contrast to previous sampling-based studies, we treat the problem as an optimization problem whose goal is to minimize the error in answering queries in the given workload. A key novelty of our approach is that we can tailor the choice of samples to be robust even for workloads that are "similar" but not necessarily identical to the given workload. Finally, our techniques recognize the importance of taking into account the variance in the data distribution in a principled manner. We show how our solution can be implemented on a database system, and present results of extensive experiments on Microsoft SQL Server 2000 that demonstrate the superior quality of our method compared to previous work.
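The abstract's point about accounting for variance in the data distribution can be illustrated with a small sketch. This is not the paper's algorithm, which is not described in this record. It is textbook stratified sampling with Neyman allocation, where strata with more rows and higher variance receive more of the sample budget. The function names `neyman_allocation` and `estimate_sum` and the demo data are hypothetical, chosen only for illustration.

```python
import random
import statistics


def neyman_allocation(strata, sample_size):
    """Split a sample budget across strata in proportion to N_h * sigma_h.

    Rounding means the allocations may not add up to sample_size exactly.
    """
    weights = [len(s) * statistics.pstdev(s) for s in strata]
    total = sum(weights)
    if total == 0:
        # Every stratum is constant: fall back to proportional allocation.
        weights = [len(s) for s in strata]
        total = sum(weights)
    # Give each stratum at least one row and never more rows than it holds.
    return [
        min(len(s), max(1, round(sample_size * w / total)))
        for s, w in zip(strata, weights)
    ]


def estimate_sum(strata, allocation, rng):
    """Estimate SUM over all rows by scaling each stratum's sample mean."""
    estimate = 0.0
    for rows, n in zip(strata, allocation):
        sample = rng.sample(rows, n)  # sampling without replacement
        estimate += len(rows) * (sum(sample) / n)
    return estimate


if __name__ == "__main__":
    # Hypothetical data: a large constant stratum and a small, spread-out one.
    strata = [[10.0] * 900, [float(v) for v in range(0, 1000, 10)]]
    allocation = neyman_allocation(strata, 20)
    approx = estimate_sum(strata, allocation, random.Random(0))
    exact = sum(sum(rows) for rows in strata)
    print("allocation:", allocation)
    print("approximate SUM:", approx, "exact SUM:", exact)
```

In the demo the constant stratum needs only a single sampled row to be estimated exactly, so almost the whole budget goes to the high-variance stratum. A uniform sample of the same size would spend about 90% of its rows on the constant stratum.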
Pages: 295-306
Page count: 12