AlphaQO: Robust Learned Query Optimizer

被引:0
作者
Yu X. [1 ]
Chai C.-L. [1 ]
Zhang X.-N. [2 ]
Tang N. [3 ]
Sun J. [1 ]
Li G.-L. [1 ]
机构
[1] Department of Computer Science and Technology, Tsinghua University, Beijing
[2] College of Computer Science and Technology, Zhejiang University, Hangzhou
[3] Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha
来源
Ruan Jian Xue Bao/Journal of Software | 2022年 / 33卷 / 03期
关键词
AI4DB; Database; Learned optimizer; Query generation; Reinforcement learning; Robustness;
D O I
10.13328/j.cnki.jos.006452
中图分类号
学科分类号
摘要
Learned database query optimizers, which are typically empowered by (deep) learning models, have attracted significant attention recently, because they can offer similar or even better performance than the state-of-the-art commercial optimizers that require hundreds of expert-hours to tune. A crucial factor of successfully training learned optimizers is training queries. Unfortunately, a good query workload that is sufficient for training learned optimizers is not always available. This study proposes a framework, called AlphaQO, on generating queries for learned optimizers with reinforcement learning (RL). AlphaQO is a loop system that consists of two main components, query generator and learned optimizer. Query generator aims at generating “hard” queries (i.e., those queries that the learned optimizer provides poor estimates). The learned optimizer will be trained using generated queries, as well as providing feedbacks (in terms of numerical rewards) to the query generator. If the generated queries are good, the query generator will get a high reward; otherwise, the query generator will get a low reward. The above process is performed iteratively, with the main goal that within a small budget, the learned optimizer can be trained and generalized well to a wide range of unseen queries. Extensive experiments show that AlphaQO can generate a relatively small number of queries and train a learned optimizer to outperform commercial optimizers. Moreover, learned optimizers need much less queries from AlphaQO than randomly generated queries, in order to well train the learned optimizer. © Copyright 2022, Institute of Software, the Chinese Academy of Sciences. All rights reserved.
引用
收藏
页码:814 / 831
页数:17
相关论文
共 34 条
[1]  
Selinger PG, Astrahan MM, Chamberlin DD, Lorie RA, Price TG., Access path selection in a relational database management system, Proc. of the Readings in Artificial Intelligence and Databases, pp. 511-522, (1989)
[2]  
Babcock B, Chaudhuri S., Towards a robust query optimizer: A principled and practical approach, Proc. of the 2005 ACM SIGMOD Int’l Conf. on Management of Data, pp. 119-130, (2005)
[3]  
Trummer I, Koch C., Solving the join ordering problem via mixed integer linear programming, Proc. of the 2017 ACM Int’l Conf. on Management of Data, pp. 1025-1040, (2017)
[4]  
Ioannidis YE, Kang YC., Left-deep vs. bushy trees: An analysis of strategy spaces and its implications for query optimization, Proc. of the ‘91 ACM SIGMOD Int’l Conf. on Management of Data, pp. 168-177, (1991)
[5]  
Chande SV, Sinha M., Genetic optimization for the join ordering problem of database queries, Proc. of the 2011 Annual IEEE India Conf. IEEE, pp. 1-5, (2011)
[6]  
Waas F, Pellenkoft A., Join order selection (good enough is easy), Proc. of the British National Conf. on Databases, pp. 51-67, (2000)
[7]  
Fegaras L., A new heuristic for optimizing large queries, Proc. of the Int’l Conf. on Database and Expert Systems Applications, pp. 726-735, (1998)
[8]  
Marcus R, Negi P, Mao HZ, Zhang C, Alizadeh M, Kraska T, Papaemmanouil O, Tatbul N., Neo: A learned query optimizer, (2019)
[9]  
Krishnan S, Yang ZH, Goldberg K, Hellerstein J, Stoica I., Learning to optimize join queries with deep reinforcement learning, (2018)
[10]  
Marcus R, Papaemmanouil O., Deep reinforcement learning for join order enumeration, Proc. of the 1st Int’l Workshop on Exploiting Artificial Intelligence Techniques for Data Management, pp. 1-4, (2018)