Multi-objective optimization for submarine optical fiber cable route planning based on collaborative reinforcement learning with LPGP framework

Cited by: 0
Authors
Zhao, Zanshan [1, 2]
Gao, Guanjun [1]
Gan, Weiming [2]
Zhang, Jialiang [1]
Wang, Zengfu [3]
Wang, Haoyu [1]
Guo, Yonggang [4]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Informat Photon & Opt Commun, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Acoust, Hainan Acoust Lab, Haikou, Peoples R China
[3] Northwestern Polytech Univ, Sch Automat, Xian, Peoples R China
[4] Chinese Acad Sci, Inst Acoust, Beijing, Peoples R China
Funding
Hainan Provincial Natural Science Foundation; National Natural Science Foundation of China;
Keywords
Submarine cable route planning; Multi-objective optimization; Local Pareto to global Pareto; Collaborative reinforcement learning;
DOI
10.1016/j.yofte.2025.104178
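Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract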
Submarine optical fiber cables are essential to international communication, transmitting approximately 99% of global traffic. The cost and survivability of these cables are key factors that must be carefully considered during the design stage. However, the cost of submarine cables and the risks closely associated with their survivability have not yet been decoupled and simultaneously optimized. In this paper, we propose a Local Pareto to Global Pareto (LPGP) paradigm for multi-objective optimization. Based on this paradigm, we design an offline collaborative reinforcement learning LPGP (Off-CRL-LPGP) framework that effectively decouples and simultaneously optimizes the cost and risk of submarine optical fiber cable routing. The results demonstrate that Off-CRL-LPGP reduces accumulated cost by 28.83% compared to ant colony optimization (ACO) under the same risk conditions, while requiring significantly less computational time. Compared to multi-agent cross reinforcement learning (MA-XRL), Off-CRL-LPGP reduces accumulated cost by 3% under the same accumulated risk and reduces accumulated risk by 1.1% under the same accumulated cost, at the expense of some additional computational time. Compared to online summation for global Pareto (On-Sum-GP), Off-CRL-LPGP reduces accumulated cost by 7.8% and accumulated risk by 23.48%. We also investigate the impact of algorithm combinations on the performance of Off-CRL-LPGP. Alternating Q-learning and SARSA (alternate Q-S/S-Q) reduces accumulated cost by 3.83% and accumulated risk by 13.78%, while improving the convergence level by 2.18 times for cost and by 3.30 times for risk. Furthermore, the data smoothing method proposed in this work reduces accumulated cost and risk by 4.58% and 6.17%, respectively, and improves stability in 97.66% of iterations, with a maximum stability improvement of 6.64 times.
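As a reading aid for the abstract, the notion of a Pareto-optimal route under two minimized objectives (accumulated cost and accumulated risk) can be illustrated with a generic non-dominated filter. This is a minimal sketch, not the Off-CRL-LPGP framework itself: the names `Route`, `dominates`, and `pareto_front` and the sample values are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Assumption: each candidate route is summarized as
# (accumulated_cost, accumulated_risk), and both objectives are minimized.
Route = Tuple[float, float]


def dominates(a: Route, b: Route) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(routes: List[Route]) -> List[Route]:
    """Return the non-dominated routes, de-duplicated and sorted by cost."""
    front = [r for r in routes if not any(dominates(o, r) for o in routes)]
    return sorted(set(front))


# Hypothetical (cost, risk) values for illustration only; not data from the paper.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (11.0, 6.0), (8.0, 9.0)]
front = pareto_front(candidates)
print(front)
```

In this toy set, `(11.0, 6.0)` is dominated by `(10.0, 5.0)` and `(8.0, 9.0)` is dominated by `(8.0, 7.0)`, so neither appears in the front. The remaining routes each trade cost against risk, which is why a multi-objective method returns a set of routes and not a single best one.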
Pages: 13