Optimizing Performance of Aggregate Query Processing with Histogram Data Structure

Cited by: 0

Authors:

Liang Yong [1]

Mu Zhaonan [1]

Affiliation:

[1] Guizhou Univ Commerce, Network & Informat Ctr, Guiyang 550014, Guizhou, Peoples R China

Source:

SOFTWARE ENGINEERING METHODS IN INTELLIGENT ALGORITHMS, VOL 1 | 2019 / Vol. 984

Keywords:

Massive data; Approximate query processing; Histogram; Aggregate query; Performance optimization;

DOI:

10.1007/978-3-030-19807-7_33

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In today's big data era, the ability to analyze massive data efficiently and return results within a short time limit is critical to decision making. Consequently, many big data systems have been proposed, and various distributed and parallel processing techniques have been heavily investigated. Most previous research focuses on precise query processing, whereas approximate query processing (AQP) techniques, which make interactive data exploration more efficient and allow users to trade off query accuracy against response time, have not been investigated comprehensively. In this paper, we study the characteristics of aggregate queries, a typical type of analytical query, and propose an approximate query processing approach that uses a histogram data structure to optimize the execution of aggregate queries over massive data. We implemented this approach in the big data system Hive and compared it with Hive and with the AQP-enabled big data system BlinkDB. The experimental results verify that our approach is significantly faster than these existing systems in most scenarios.
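The abstract does not describe the histogram layout or the estimation rules the authors use, so the following is only a generic illustration of how a histogram can answer a range aggregate (COUNT and SUM) approximately without scanning the raw data. It is a minimal sketch that assumes equi-width buckets and a uniform distribution of values inside each bucket; the names `Bucket`, `build_histogram`, and `approx_count_sum` are hypothetical and are not taken from the paper, Hive, or BlinkDB.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Bucket:
    lo: float           # inclusive lower bound of the bucket
    hi: float           # exclusive upper bound of the bucket
    count: int = 0      # number of values that fell into the bucket
    total: float = 0.0  # sum of the values that fell into the bucket


def build_histogram(values: Iterable[float], lo: float, hi: float,
                    num_buckets: int) -> List[Bucket]:
    """Build an equi-width histogram over [lo, hi) in one pass (assumed layout)."""
    width = (hi - lo) / num_buckets
    buckets = [Bucket(lo + i * width, lo + (i + 1) * width)
               for i in range(num_buckets)]
    for v in values:
        if not lo <= v < hi:
            continue  # values outside the histogram domain are ignored here
        idx = min(int((v - lo) / width), num_buckets - 1)
        buckets[idx].count += 1
        buckets[idx].total += v
    return buckets


def approx_count_sum(buckets: List[Bucket], q_lo: float,
                     q_hi: float) -> Tuple[float, float]:
    """Estimate COUNT(*) and SUM(x) for the predicate q_lo <= x < q_hi."""
    est_count = 0.0
    est_sum = 0.0
    for b in buckets:
        if q_lo <= b.lo and b.hi <= q_hi:
            # Bucket fully covered by the query: its statistics are exact.
            est_count += b.count
            est_sum += b.total
            continue
        overlap_lo = max(b.lo, q_lo)
        overlap_hi = min(b.hi, q_hi)
        if overlap_hi <= overlap_lo:
            continue  # no overlap with the query range
        # Partial overlap: assume values are spread uniformly in the bucket.
        frac = (overlap_hi - overlap_lo) / (b.hi - b.lo)
        part_count = frac * b.count
        est_count += part_count
        est_sum += part_count * (overlap_lo + overlap_hi) / 2.0
    return est_count, est_sum
```

For example, `approx_count_sum(build_histogram(range(100), 0, 100, 10), 5, 25)` touches only three buckets instead of 100 rows. The estimate is exact for fully covered buckets and carries error only in the two partially covered ones, which is the accuracy versus response-time trade-off the abstract refers to.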


Pages: 342-350

Page count: 9

