A deep multi-agent reinforcement learning approach for the micro-service migration problem with affinity in the cloud

Cited by: 0
Authors
Ma, Ning [1]
Tang, Angjun [1]
Xiong, Zifeng [2]
Jiang, Fuxin [3]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Publ Policy & Adm, Xian 710049, Peoples R China
[2] Xi An Jiao Tong Univ, Sch Software Engn, Xian 710049, Peoples R China
[3] ByteDance Inc, Beijing 100098, Peoples R China
Keywords
Micro-service; Migration; Invoking traffic; Resource optimization; Deep reinforcement learning; Virtual machines; Live migration
DOI
10.1016/j.eswa.2025.126856
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper addresses the micro-service migration problem with affinity, which arises in the cloud computing industry. Because micro-services are periodically created and deleted to satisfy user demand, their deployment in the cloud must be adjusted regularly; this adjustment is referred to as micro-service migration. An optimal migration schedule should minimize the number of activated physical machines while maximizing the total internal invoking traffic (affinity). A cooperative multi-agent reinforcement learning (MARL) approach is proposed, enhanced by integrating Hindsight Reward Shaping and by fine-tuning the state encoder from a pre-trained ResNet model. The proposed MARL approach is validated on synthetic datasets and on real cloud traces from ByteDance and Alibaba, and is compared with four baselines: Migration Ant Colony Optimization, Migration Neighborhood Search, Single-Agent Reinforcement Learning, and the optimization solver CPLEX. Finally, an evaluation mechanism called Matching Score is proposed to explain the superior performance of MARL.
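The two objectives named in the abstract can be illustrated with a minimal sketch. It assumes that "internal invoking traffic" means traffic between micro-services placed on the same physical machine; the function name `evaluate_placement`, the service and machine names, and the traffic values are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple


def evaluate_placement(
    placement: Dict[str, str],
    traffic: Dict[Tuple[str, str], float],
) -> Tuple[int, float]:
    """Return (activated machines, internal invoking traffic) for a placement.

    placement maps each micro-service to a physical machine.
    traffic maps a pair of micro-services to their invoking traffic volume.
    Assumption: traffic counts as internal when both services share a machine.
    """
    # A machine is activated if at least one micro-service is placed on it.
    active_machines = len(set(placement.values()))
    # Sum the traffic of every service pair co-located on one machine.
    internal_traffic = sum(
        volume
        for (svc_a, svc_b), volume in traffic.items()
        if placement[svc_a] == placement[svc_b]
    )
    return active_machines, internal_traffic


# Hypothetical example: three services on two machines.
traffic = {("auth", "cart"): 5.0, ("cart", "search"): 2.0}
before = {"auth": "pm1", "cart": "pm1", "search": "pm2"}
# Migrating "search" onto pm1 frees pm2 and makes all traffic internal.
after = {"auth": "pm1", "cart": "pm1", "search": "pm1"}

print(evaluate_placement(before, traffic))
print(evaluate_placement(after, traffic))
```

In this toy case the migration improves both objectives at once; in general the two can conflict under machine capacity limits, which this sketch does not model.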
Pages: 19