Approximate Query Processing Based on Approximate Materialized View

Times cited: 0
Authors
Wu, Yuhan [1]
Guo, Haifeng [1]
Yang, Donghua [1]
Li, Mengmeng [1]
Zheng, Bo [2]
Wang, Hongzhi [1]
Affiliations
[1] Harbin Institute of Technology, Harbin, People's Republic of China
[2] ConDB, Beijing, People's Republic of China
Source
ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2023, PT II | 2024 / Vol. 14488
Keywords
Approximate materialized view; Materialized views reuse; AQP++ optimization; Approximate query processing; GROUP-BY
DOI
10.1007/978-981-97-0801-7_10
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In the context of big data, interactive analysis database systems need to answer aggregate queries within a reasonable response time. The AQP++ framework integrates data preprocessing with approximate query processing (AQP): it connects an existing AQP engine with a data preprocessing method so that the two work together during interactive analysis. Studying how materialized views are used in the AQP++ framework, we found that the materialized views in both parts of the framework come from exact precomputed results, so a time bottleneck remains on large-scale data. To address this limitation, we propose using approximate materialized views for subsequent result reuse. Taking the method of identifying approximate intervals as an example, we compare the improvement that approximate materialized views bring to AQP++ and try different sampling methods to find better time and accuracy results. By constructing larger samples, we compare the time, space, and accuracy of approximate and general materialized views in AQP++, and analyze why our methods perform poorly in some cases. The experimental results show that approximate materialized views can improve the AQP++ framework: they effectively save time and storage space in the preprocessing stage while achieving accuracy similar to or better than general AQP results.
Pages: 168-185
Page count: 18