SUM-optimal histograms for approximate query processing

Cited by: 0
Authors
Meifan Zhang
Hongzhi Wang
Jianzhong Li
Hong Gao
Affiliation
[1] Harbin Institute of Technology, Department of Computer Science and Technology
Source
Knowledge and Information Systems | 2020 / Vol. 62
Keywords
Approximate query processing; Histogram; Big data
DOI
Not available
Abstract
In this paper, we study the problem of approximating SUM queries with histograms. We define a new kind of histogram, the SUM-optimal histogram, which provides more accurate estimates for SUM queries than the traditional equi-depth and V-optimal histograms. We propose three methods for constructing the histogram. The first is a dynamic programming method, and the other two are approximate methods. We use a greedy strategy to insert separators into a histogram and stochastic gradient descent to improve the accuracy of the separators. The experimental results indicate that our approach provides more accurate estimates for SUM queries than the equi-depth and V-optimal histograms.
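For context, the following minimal Python sketch shows how a traditional equi-depth histogram, one of the baselines named in the abstract, can answer a range SUM query. It is not the SUM-optimal construction from the paper. The function names `build_equi_depth` and `estimate_sum` are hypothetical, and the sketch assumes that a SUM query means the sum of all values x with a <= x <= b and that values are spread uniformly within each bucket.

```python
def build_equi_depth(values, num_buckets):
    """Split the sorted values into buckets holding (nearly) equal counts.

    Each bucket is a tuple (lo, hi, count) with inclusive bounds.
    """
    data = sorted(values)
    n = len(data)
    buckets = []
    for i in range(num_buckets):
        chunk = data[i * n // num_buckets:(i + 1) * n // num_buckets]
        if chunk:
            buckets.append((chunk[0], chunk[-1], len(chunk)))
    return buckets


def estimate_sum(buckets, a, b):
    """Estimate SUM(x) over a <= x <= b from bucket summaries only.

    Assumption: values are uniformly spread inside each bucket, so the
    overlapping part holds count * (overlap / width) values whose mean
    is the midpoint of the overlap.
    """
    total = 0.0
    for lo, hi, count in buckets:
        if hi < a or lo > b:
            continue  # bucket lies entirely outside the query range
        if hi == lo:
            total += count * lo  # all values in the bucket are identical
            continue
        left, right = max(lo, a), min(hi, b)
        fraction = (right - left) / (hi - lo)
        total += count * fraction * (left + right) / 2
    return total


if __name__ == "__main__":
    values = list(range(1, 101))
    hist = build_equi_depth(values, 4)
    exact = sum(x for x in values if 10 <= x <= 20)
    print("estimate:", estimate_sum(hist, 10, 20), "exact:", exact)
```

The gap between the estimate and the exact answer on partial-bucket queries comes from where the separators fall, which is the quantity a SUM-oriented construction would try to optimize.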
Pages: 3155-3180
Number of pages: 25