Optimizing Performance of Aggregate Query Processing with Histogram Data Structure

Times cited: 0
Authors
Liang Yong [1]
Mu Zhaonan [1]
Affiliation
[1] Guizhou Univ Commerce, Network & Informat Ctr, Guiyang 550014, Guizhou, Peoples R China
Source
SOFTWARE ENGINEERING METHODS IN INTELLIGENT ALGORITHMS, VOL 1 | 2019 / Vol. 984
Keywords
Massive data; Approximate query processing; Histogram; Aggregate query; Performance optimization
DOI
10.1007/978-3-030-19807-7_33
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In today's big data era, the ability to analyze massive data efficiently and return results within a short time limit is critical to decision making. Many big data systems have therefore been proposed, and various distributed and parallel processing techniques have been heavily investigated. Most previous research focuses on precise query processing, whereas approximate query processing (AQP) techniques, which make interactive data exploration more efficient and allow users to trade off query accuracy against response time, have not been investigated comprehensively. In this paper, we study the characteristics of aggregate queries, a typical type of analytical query, and propose an approximate query processing approach that uses a histogram data structure to optimize the execution of aggregate queries over massive data. We implemented this approach in the big data system Hive and compared it with Hive and the AQP-enabled big data system BlinkDB. The experimental results verify that our approach is significantly faster than these existing systems in most scenarios.
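The abstract does not describe the authors' implementation, so the following is only a minimal sketch of the general idea of answering an aggregate query from a histogram instead of scanning the data. It assumes an equi-width histogram that stores a count and a sum per bucket and assumes values are spread uniformly within each bucket; the function names `build_histogram` and `approx_range_aggregate` are hypothetical and do not come from the paper, Hive, or BlinkDB.

```python
def build_histogram(values, lo, hi, num_buckets):
    """Build an equi-width histogram over [lo, hi] with per-bucket count and sum.

    Assumption: every value lies in [lo, hi]; a value equal to hi is
    placed in the last bucket.
    """
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    sums = [0.0] * num_buckets
    for v in values:
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
        sums[idx] += v
    return {"lo": lo, "width": width, "counts": counts, "sums": sums}


def approx_range_aggregate(hist, a, b):
    """Approximate COUNT, SUM and AVG of values v with a <= v < b.

    Each bucket contributes in proportion to how much of its range
    overlaps [a, b), which is exact only if values are spread
    uniformly inside the bucket (an assumption of this sketch).
    """
    count = 0.0
    total = 0.0
    for i, (c, s) in enumerate(zip(hist["counts"], hist["sums"])):
        bucket_lo = hist["lo"] + i * hist["width"]
        bucket_hi = bucket_lo + hist["width"]
        overlap = max(0.0, min(b, bucket_hi) - max(a, bucket_lo))
        frac = overlap / hist["width"]
        count += c * frac
        total += s * frac
    avg = total / count if count else None
    return count, total, avg


if __name__ == "__main__":
    data = list(range(100))  # toy data set, not from the paper
    hist = build_histogram(data, lo=0, hi=100, num_buckets=10)
    print(approx_range_aggregate(hist, 20, 50))
```

In this sketch the query touches only the bucket summaries, so its cost depends on the number of buckets rather than the number of rows. The answer is exact when the query range aligns with bucket boundaries and approximate otherwise, which is the accuracy versus response-time tradeoff the abstract refers to.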
Pages: 342-350
Page count: 9