Zero-Shot Cost Models for Out-of-the-box Learned Cost Prediction

Cited by: 21
Authors
Hilprecht, Benjamin [1]
Binnig, Carsten [1, 2]
Affiliations
[1] Tech Univ Darmstadt, Darmstadt, Germany
[2] DFKI, Kaiserslautern, Germany
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2022, Vol. 15, No. 11
Keywords
QUERIES;
DOI
10.14778/3551793.3551799
Chinese Library Classification (CLC)
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we introduce zero-shot cost models, which enable learned cost estimation that generalizes to unseen databases. In contrast to state-of-the-art workload-driven approaches, which require executing a large set of training queries on every new database, zero-shot cost models allow a learned cost model to be instantiated out of the box, without expensive training data collection. To enable such zero-shot cost models, we propose a new learning paradigm based on pre-trained cost models. As core contributions to support the transfer of such a pre-trained cost model to unseen databases, we introduce a new model architecture and a representation technique for encoding query workloads as input to those models. As we show in our evaluation, zero-shot cost estimation can provide more accurate cost estimates than state-of-the-art models for a wide range of (real-world) databases without requiring any query executions on unseen databases. Furthermore, we show that zero-shot cost models can be used in a few-shot mode that further improves their quality by retraining them with only a small number of additional training queries on the unseen database.
Pages: 2361-2374
Page count: 14