Learning-Based Sample Tuning for Approximate Query Processing in Interactive Data Exploration

Cited by: 0
Authors
Zhang, Hanbing [1]
Jing, Yinan [1]
He, Zhenying [1]
Zhang, Kai [1]
Wang, X. Sean [1]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai 200437, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Measurement; Adaptation models; Costs; Tuners; Accuracy; Q-learning; Query processing; Optimization; Synthetic data; Approximate query processing; interactive data exploration; data analysis;
DOI
10.1109/TKDE.2023.3341451
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
For interactive data exploration, approximate query processing (AQP) is a useful approach that typically uses samples to answer queries quickly at the cost of some query accuracy. Existing AQP systems often materialize samples in memory so they can be reused to speed up query processing. Tuning these samples to fit the workload is one of the key problems in AQP. However, data exploration workloads are too complex to predict accurately, so existing sample tuning approaches adapt poorly to changing workloads. To address this problem, this paper proposes RL-STuner, a sample tuner based on deep reinforcement learning. When tuning samples, RL-STuner considers workload changes from a global perspective and uses a Deep Q-learning Network (DQN) model to select the sample set with the maximum utility for the current workload. In addition, the paper proposes a set of optimization mechanisms that reduce the cost of sample tuning. Experimental results on both real-world and synthetic datasets show that RL-STuner outperforms existing sample tuning approaches, achieving 1.6x-5.2x improvements in query accuracy at a low tuning cost.
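The sample-based trade-off described in the abstract can be illustrated with a minimal sketch. This is not RL-STuner or any method from the paper. It only assumes a uniform random sample held in memory and a normal approximation for the error bound. The function names, the synthetic column, and the sample size are all hypothetical.

```python
import random
import statistics


def approximate_avg(sample):
    """Estimate AVG(column) from a uniform sample.

    Returns the estimate and a rough 95% confidence half-width,
    assuming a normal approximation for the sample mean.
    """
    mean = statistics.fmean(sample)
    stderr = statistics.stdev(sample) / len(sample) ** 0.5
    return mean, 1.96 * stderr


def approximate_count(sample, predicate, population_size):
    """Estimate COUNT(*) WHERE predicate by scaling the sample hit rate."""
    hits = sum(1 for value in sample if predicate(value))
    return population_size * hits / len(sample)


# Hypothetical data: one numeric column with 200,000 rows.
rng = random.Random(0)
population = [rng.gauss(100.0, 15.0) for _ in range(200_000)]

# Materialize a 1% uniform sample once, then reuse it across queries.
sample = rng.sample(population, 2_000)

est_avg, half_width = approximate_avg(sample)
exact_avg = statistics.fmean(population)

est_count = approximate_count(sample, lambda v: v > 120.0, len(population))
exact_count = sum(1 for v in population if v > 120.0)

print(f"AVG   approx={est_avg:.2f} +/- {half_width:.2f}  exact={exact_avg:.2f}")
print(f"COUNT approx={est_count:.0f}  exact={exact_count}")
```

Each query scans 2,000 rows instead of 200,000, and the answer carries an error that depends on which sample is kept. Choosing which samples to keep for a changing workload is the tuning problem the abstract refers to.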
Pages: 6532-6546
Number of pages: 15