Multi-Agent Reinforcement Learning for Dynamic Topology Optimization of Mesh Wireless Networks

Cited by: 1
Authors
Sun, Wei [1 ,2 ]
Lv, Qiushuo [1 ,2 ]
Xiao, Yang [3 ]
Liu, Zhi [4 ]
Tang, Qingwei [1 ,2 ]
Li, Qiyue [1 ,2 ]
Mu, Daoming [1 ,2 ]
Affiliations
[1] Hefei Univ Technol, Sch Elect & Automat Engn, Hefei 230009, Anhui, Peoples R China
[2] Anhui Engn Technol Res Ctr Ind Automat, Hefei 230009, Peoples R China
[3] Univ Alabama, Dept Comp Sci, Tuscaloosa, AL 35487 USA
[4] Univ Electrocommun, Dept Comp & Network Engn, Tokyo 1828585, Japan
Funding
National Natural Science Foundation of China
Keywords
Delays; Trajectory; Topology; Network topology; Vectors; Wireless networks; Logic gates; Actor-critic; mesh wireless network; reinforcement learning; topology optimization; ad hoc wireless network; IEEE-802.11; SCHEME;
DOI
10.1109/TWC.2024.3372694
CLC Classification
TM [Electrical Engineering]; TN [Electronic and Communication Technology]
Discipline Codes
0808; 0809
Abstract
In Mesh Wireless Networks (MWNs), network coverage is extended by connecting Access Points (APs) in a mesh topology, where frames forwarded over multi-hop routes must sustain performance metrics such as end-to-end (E2E) delay and channel efficiency. Several recent studies have focused on minimizing E2E delay, but these methods cannot adapt to the dynamic nature of MWNs. Reinforcement-learning-based methods, by contrast, adapt better to such dynamics but suffer from high-dimensional action spaces, which slow convergence. In this paper, we propose a multi-agent actor-critic reinforcement learning (MACRL) algorithm that optimizes multiple objectives, specifically minimizing E2E delay and enhancing channel efficiency. First, to reduce the action space and speed up convergence during dynamic optimization, a centralized-critic, distributed-actor scheme is proposed. Then, a multi-objective reward-balancing method is designed to dynamically balance MWN performance between E2E delay and channel efficiency. Finally, the trained MACRL algorithm is deployed in the QualNet simulator to verify its effectiveness.
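
The PyTorch sketch below illustrates the general shape of a centralized-critic, distributed-actor scheme with a scalarized two-objective reward, as described in the abstract. It is a minimal sketch under assumptions: the class names (Actor, CentralizedCritic), the helper balanced_reward, the network sizes, and the fixed reward weights are all illustrative and do not reproduce the paper's actual MACRL implementation.

import torch
import torch.nn as nn

class Actor(nn.Module):
    """Per-AP policy: maps a local observation to logits over candidate links.
    Each AP runs its own actor at execution time, so the per-agent action
    space stays small even as the network grows."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

class CentralizedCritic(nn.Module):
    """Training-only critic that scores the joint observation-action of all
    agents; it is discarded at deployment, leaving only the distributed actors."""
    def __init__(self, n_agents: int, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        joint_dim = n_agents * (obs_dim + n_actions)
        self.net = nn.Sequential(
            nn.Linear(joint_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, joint_obs: torch.Tensor, joint_act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

def balanced_reward(e2e_delay: float, channel_eff: float,
                    w_delay: float = 0.5, w_eff: float = 0.5) -> float:
    """Scalarize the two objectives: lower delay and higher channel efficiency
    both increase the reward. The paper balances the weights dynamically;
    they are fixed here purely for illustration."""
    return -w_delay * e2e_delay + w_eff * channel_eff

# Illustrative usage: 4 APs, 8-dim local observations, 3 candidate links each.
actors = [Actor(obs_dim=8, n_actions=3) for _ in range(4)]
critic = CentralizedCritic(n_agents=4, obs_dim=8, n_actions=3)
local_obs = torch.randn(1, 8)
action = torch.distributions.Categorical(logits=actors[0](local_obs)).sample()
value = critic(torch.randn(1, 4 * 8), torch.randn(1, 4 * 3))

Because each actor conditions only on its own local observation, the joint topology decision decomposes into per-AP link choices, which is what lets this kind of scheme avoid the high-dimensional joint action space noted in the abstract.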
Pages
10501-10513 (13 pages)
References
30 in total
[1]  
Amiripalli Shanmuk Srinivas, 2019, Cognitive Informatics and Soft Computing. Proceedings of CISC 2017. Advances in Intelligent Systems and Computing (AISC 768), P75, DOI 10.1007/978-981-13-0617-4_8
[2]   Performance analysis of the IEEE 802.11 distributed coordination function [J].
Bianchi, G.
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, 2000, 18 (03) :535-547
[3]  
Busoniu L, 2010, STUD COMPUT INTELL, V310, P183
[4]   Performance analysis of IEEE 802.11 DCF in presence of transmission errors [J].
Chatzimisios, P.;
Boucouvalas, A. C.;
Vitsas, V.
2004 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, VOLS 1-7, 2004, :3854-3858
[5]   Wireless-Powered Sensor Networks: How to Realize [J].
Choi, Kae Won;
Ginting, Lorenz;
Rosyady, Phisca Aditya;
Aziz, Arif Abdul;
Kim, Dong In.
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2017, 16 (01) :221-234
[6]  
Cooper Robert B, 1981, P ACM C, P119
[7]   Phasor Alternatives to Friis' Transmission Equation [J].
Franek, Ondrej.
IEEE ANTENNAS AND WIRELESS PROPAGATION LETTERS, 2018, 17 (01) :90-93
[8]   Designing an adaptive production control system using reinforcement learning [J].
Kuhnle, Andreas;
Kaiser, Jan-Philipp;
Theiss, Felix;
Stricker, Nicole;
Lanza, Gisela.
JOURNAL OF INTELLIGENT MANUFACTURING, 2021, 32 (03) :855-876
[9]   Two-Stage Semi-Distributed Resource Management for Device-to-Device Communication in Cellular Networks [J].
Lee, Dong Heon;
Choi, Kae Won;
Jeon, Wha Sook;
Jeong, Dong Geun.
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2014, 13 (04) :1908-1920
[10]   Topology optimization of particle swarm optimization [J].
Springer Verlag, Vol. 8794: 142-149