Sample-Efficient Cardinality Estimation Using Geometric Deep Learning

被引:4
作者
Reiner, Silvan [1 ]
Grossniklaus, Michael [1 ]
机构
[1] Univ Konstanz, Constance, Germany
来源
PROCEEDINGS OF THE VLDB ENDOWMENT | 2023年 / 17卷 / 04期
关键词
QUERY; OPTIMIZER;
D O I
10.14778/3636218.3636229
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In database systems, accurate cardinality estimation is a cornerstone of effective query optimization. In this context, estimators that use machine learning have shown significant promise. Despite their potential, the effectiveness of these learned estimators strongly depends on their ability to learn from small training sets. This paper presents a novel approach for learned cardinality estimation that addresses this issue by enhancing sample efficiency. We propose a neural network architecture informed by geometric deep learning principles that represents queries as join graphs. Furthermore, we introduce an innovative encoding for complex predicates, treating their encoding as a feature selection problem. Additionally, we devise a regularization term that employs equalities of the relational algebra and three-valued logic, augmenting the training process without requiring additional ground truth cardinalities. We rigorously evaluate our model across multiple benchmarks, examining q-errors, runtimes, and the impact of workload distribution shifts. Our results demonstrate that our model significantly improves the end-to-end runtimes of PostgreSQL, even with cardinalities gathered from as little as 100 query executions.
引用
收藏
页码:740 / 752
页数:13
相关论文
共 45 条
[1]   LOGER: A Learned Optimizer towards Generating Efficient and Robust Query Execution Plans [J].
Chen, Tianyi ;
Gao, Jun ;
Chen, Hedui ;
Tu, Yaofeng .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2023, 16 (07) :1777-1789
[2]   DSB: A Decision Support Benchmark for Workload-Driven and Traditional Database Systems [J].
Ding, Bailu ;
Chaudhuri, Surajit ;
Gehrke, Johannes ;
Narasayya, Vivek .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 14 (13) :3376-3388
[3]   Selectivity Estimation for Range Predicates using Lightweight Models [J].
Dutt, Anshuman ;
Wang, Chi ;
Nazi, Azade ;
Kandula, Srikanth ;
Narasayya, Vivek ;
Chaudhuri, Surajit .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2019, 12 (09) :1044-1057
[4]   Efficient greedy feature selection for unsupervised learning [J].
Farahat, Ahmed K. ;
Ghodsi, Ali ;
Kamel, Mohamed S. .
KNOWLEDGE AND INFORMATION SYSTEMS, 2013, 35 (02) :285-310
[5]  
Fey M, 2019, Arxiv, DOI [arXiv:1903.02428, DOI 10.48550/ARXIV.1903.02428]
[6]  
Gilmer J, 2017, PR MACH LEARN RES, V70
[7]  
Hamilton WL, 2017, ADV NEUR IN, V30
[8]  
Han Y., 2021, VLDB, V15, P752, DOI 10.14778/3503585.3503586
[9]  
He Haoyu, 2022, WORKSHOP FORMAL VERI
[10]   Zero-Shot Cost Models for Out-of-the-box Learned Cost Prediction [J].
Hilprecht, Benjamin ;
Binnig, Carsten .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2022, 15 (11) :2361-2374