Eigen: End-to-end Resource Optimization for Large-Scale Databases on the Cloud

Cited by: 6
Authors
Li, Ji You [1]
Zhang, Jiachi [1]
Zhou, Wenchao [1]
Liu, Yuhang [1]
Zhang, Shuai [1]
Xue, Zhuoming [1]
Xu, Ding [1]
Fan, Hua [1]
Zhou, Fangyuan [1]
Li, Feifei [1]
Affiliation
[1] Alibaba Group, Hangzhou, People's Republic of China
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2023 / Volume 16 / Issue 12
DOI
10.14778/3611540.3611565
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Increasingly, cloud database vendors host large-scale, geographically distributed clusters to provide cloud database services. When managing these clusters, we observe that it is challenging to simultaneously maximize the resource allocation ratio and resource availability. This problem becomes more severe in modern cloud database clusters, where resource allocations occur more frequently and on a greater scale. To improve the resource allocation ratio without hurting resource availability, we introduce Eigen, a cloud-native cluster management system for large-scale databases on the cloud. Based on a resource flow model, we propose a hierarchical resource management system and three resource optimization algorithms that enable end-to-end resource optimization. Furthermore, we describe system optimizations that improve user experience by reducing scheduling latencies and increasing scheduling throughput. Eigen has been running in a large-scale public-cloud production environment for 30+ months and serves more than 30 regions (100+ availability zones) globally. Based on evaluation on real-world clusters and simulated experiments, Eigen improves the allocation ratio by over 27% (from 60% to 87.0%) on average, while keeping the ratio of delayed resource provisions under 0.1%.
Pages: 3795-3807
Page count: 13