Learning-Based Sample Tuning for Approximate Query Processing in Interactive Data Exploration

Times cited: 0
Authors
Zhang, Hanbing [1]
Jing, Yinan [1]
He, Zhenying [1]
Zhang, Kai [1]
Wang, X. Sean [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200437, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Measurement; Adaptation models; Costs; Tuners; Accuracy; Q-learning; Query processing; Optimization; Synthetic data; Approximate query processing; interactive data exploration; data analysis;
DOI
10.1109/TKDE.2023.3341451
Chinese Library Classification
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
For interactive data exploration, approximate query processing (AQP) is a useful approach that typically answers queries from samples, trading some query accuracy for a timely response. Existing AQP systems often materialize samples in memory and reuse them to speed up query processing. How to tune these samples to the workload is one of the key problems in AQP. However, because data exploration workloads are too complex to predict accurately, existing sample tuning approaches adapt poorly to changing workloads. To address this problem, this paper proposes RL-STuner, a sample tuner based on deep reinforcement learning. When tuning samples, RL-STuner considers workload changes from a global perspective and uses a Deep Q-learning Network (DQN) model to select the sample set with the maximum utility for the current workload. In addition, this paper proposes a set of optimization mechanisms to reduce the sample tuning cost. Experimental results on both real-world and synthetic datasets show that RL-STuner outperforms existing sample tuning approaches, improving query accuracy by 1.6x-5.2x at a low tuning cost.
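As a rough illustration of the Q-learning idea named in the abstract, the sketch below applies one standard tabular Q-learning update. It is not the paper's method: RL-STuner uses a DQN, and its state, action, and reward definitions are not given in this record. Here a state is assumed to be the set of materialized samples, an action the next sample to materialize, and the reward a stand-in for sample utility. The function name `q_update` and the sample names are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q(state, action)."""
    # Value of the best action available from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Hypothetical setup (an assumption, not taken from the paper):
# a state is the set of materialized samples, an action is the
# candidate sample to materialize next, and the reward stands in
# for the utility of the resulting sample set on the workload.
actions = ["sample_by_region", "sample_by_date"]
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
state = frozenset()
next_state = frozenset({"sample_by_region"})

new_value = q_update(
    q, state, "sample_by_region", reward=1.0,
    next_state=next_state, actions=actions,
)
print(new_value)
```

A DQN replaces the table `q` with a neural network that estimates the same values, which matters when the number of possible sample sets is too large to enumerate.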
Pages: 6532-6546
Page count: 15
Related papers
50 in total
  • [1] Learned Optimizer for Online Approximate Query Processing in Data Exploration
    Liu, Liyuan
    Zhang, Hanbing
    Jing, Yinan
    He, Zhenying
    Zhang, Kai
    Wang, X. Sean
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (08) : 3977 - 3991
  • [2] LAQP: Learning-based approximate query processing
    Zhang, Meifan
    Wang, Hongzhi
    INFORMATION SCIENCES, 2021, 546 : 1113 - 1134
  • [3] Learning-Based Optimization for Online Approximate Query Processing
    Bi, Wenyuan
    Zhang, Hanbing
    Jing, Yinan
    He, Zhenying
    Zhang, Kai
    Wang, X. Sean
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2022, PT I, 2022, : 96 - 103
  • [4] Approximate Query Processing Based on Approximate Materialized View
    Wu, Yuhan
    Guo, Haifeng
    Yang, Donghua
    Li, Mengmeng
    Zheng, Bo
    Wang, Hongzhi
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2023, PT II, 2024, 14488 : 168 - 185
  • [5] A Scalable Query Materialization Algorithm for Interactive Data Exploration
    Dhankar, Archana
    Singh, Vikram
    2016 FOURTH INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND GRID COMPUTING (PDGC), 2016, : 128 - 133
  • [6] CS*: Approximate Query Processing on Big Data using Scalable Join Correlated Sample Synopsis
    Yu, Feng
    Hou, Wen-Chi
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 583 - 592
  • [7] DeepSPACE: Approximate Geospatial Query Processing with Deep Learning
    Vorona, Dimitri
    Kipf, Andreas
    Neumann, Thomas
    Kemper, Alfons
    27TH ACM SIGSPATIAL INTERNATIONAL CONFERENCE ON ADVANCES IN GEOGRAPHIC INFORMATION SYSTEMS (ACM SIGSPATIAL GIS 2019), 2019, : 500 - 503
  • [8] Compressed data cube for approximate OLAP query processing
    Feng, Y
    Wang, S
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2002, 17 (05) : 625 - 635
  • [9] Compressed data cube for approximate OLAP query processing
    Yu Feng
    Shan Wang
    Journal of Computer Science and Technology, 2002, 17 : 625 - 635
  • [10] An analysis of query-agnostic sampling for interactive data exploration
    Liu, Wenzhao
    Diao, Yanlei
    Liu, Anna
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2018, 47 (16) : 3820 - 3837