Dynamic Sizing of Cloud-Native Telco Data Centers With Digital Twin and Reinforcement Learning

Cited by: 0
Authors
Pentelas, Angelos [1]
Katsiros, Dimitris [2]
Paranou, Dimitra [2]
Doukas, George [1]
Chondralis, Konstantinos [1]
Giannopoulos, Giorgos [2]
Angelou, Evangelos [1]
Papastefanatos, George [2]
Affiliations
[1] Intracom SA Telecom Solut, Paiania 19002, Greece
[2] Athena Res Ctr, Maroussi 15125, Greece
Keywords
Decision making; Filtering; Data centers; Network function virtualization; Resource management; Reinforcement learning; Prediction algorithms; Data center management; resource allocation; network function virtualization; reinforcement learning; digital twin; NETWORK;
DOI
10.1109/ACCESS.2024.3421289
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Telco edge data centers (DCs) accommodate applications whose load fluctuates considerably over the course of a day. This variability calls for swift resource adjustments to avoid unnecessary costs during off-peak periods, when a significant fraction of nodes may be under-utilized. Tackling this challenge is integral to the operational efficiency and cost-effectiveness of telco edge DCs. To this end, this article addresses the Dynamic Data Center Sizing (DDS) problem, which boils down to optimizing the number of active nodes according to current resource demand. The proposed DDS solution consists of two core modules: a forecasting module, which predicts resource demands, and a decision-making module, which acts upon the predicted demands. Decision-making in DDS is implemented via the filtering and the rank-drain-observe (RDO) algorithms. Filtering is based on integer linear programming and computes the theoretically optimal state of the DC from the predicted resource demands. RDO is a heuristic that strives to realize the optimal DC state in an iterative and robust fashion. To expedite RDO in large-scale clusters, we further devise a Reinforcement Learning (RL)-enabled DDS variant (i.e., RL-DDS), in which RDO integrates an RL agent that computes optimized batches of nodes for concurrent deactivation. We propose an innovative solution based on the notion of a Digital Twin to train the RL agent in emulation mode. DDS and RL-DDS are evaluated on real-life testbeds resembling actual DCs. Results demonstrate significant cost reduction, ranging from 7% in conservative and relatively static scenarios up to 38% in highly dynamic settings.
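The abstract does not give the formulation of filtering or RDO, so the following Python is only a minimal illustrative sketch of the general idea and not the authors' method. It assumes a single scalar resource, identical nodes, and a fixed headroom fraction. The function names (target_active_nodes, rank_drain_observe) and all parameters are hypothetical, and a simple ceiling computation stands in for the integer linear program.

```python
import math


def target_active_nodes(predicted_demand, node_capacity, headroom=0.2, min_nodes=1):
    """Smallest node count whose usable capacity covers the predicted demand.

    Stand-in for the ILP-based filtering step (assumption: identical nodes,
    one resource dimension, a fixed headroom fraction kept free per node).
    """
    usable = node_capacity * (1.0 - headroom)
    return max(min_nodes, math.ceil(predicted_demand / usable))


def rank_drain_observe(node_loads, node_capacity, predicted_demand, headroom=0.2):
    """Deactivate nodes one at a time until the target count is reached.

    node_loads maps a node name to its current load. Returns the list of
    drained nodes (in order) and the resulting loads of the active nodes.
    """
    usable = node_capacity * (1.0 - headroom)
    target = target_active_nodes(predicted_demand, node_capacity, headroom)
    active = dict(node_loads)
    drained = []
    while len(active) > target:
        # Rank: the least-loaded node is the cheapest one to evacuate.
        victim = min(active, key=active.get)
        trial = {name: load for name, load in active.items() if name != victim}
        remaining = active[victim]
        # Drain: move the victim's load onto the emptiest remaining nodes.
        for name in sorted(trial, key=trial.get):
            moved = min(remaining, usable - trial[name])
            if moved > 0:
                trial[name] += moved
                remaining -= moved
        # Observe: if some load could not be placed, keep the node and stop.
        if remaining > 1e-9:
            break
        active = trial
        drained.append(victim)
    return drained, active


if __name__ == "__main__":
    loads = {"n1": 10, "n2": 50, "n3": 5, "n4": 30}
    drained, active = rank_drain_observe(loads, node_capacity=100, predicted_demand=95)
    print(drained)  # ['n3', 'n1']
    print(active)   # {'n2': 50, 'n4': 45}
```

Draining one node per iteration and re-checking placement before committing mirrors the iterative, cautious behavior the abstract attributes to RDO. The RL-DDS variant described above would instead choose a batch of nodes per iteration, which this sketch does not attempt.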
Pages: 91462-91479
Number of pages: 18