A coordinated rendezvous method for unmanned surface vehicle swarms based on multi-agent reinforcement learning

Cited by: 0
Authors
Xia J. [1, 2]
Liu Z. [1 ]
Zhu X. [3 ]
Liu Z. [1 ]
Affiliations
[1] School of Weaponry Engineering, Naval University of Engineering, Wuhan
[2] Qingdao campus, Naval Aviation University, Qingdao
[3] School of Electronic Engineering, Naval University of Engineering, Wuhan
Source
Beijing Hangkong Hangtian Daxue Xuebao/Journal of Beijing University of Aeronautics and Astronautics | 2023 / Vol. 49 / No. 12
Funding
China Postdoctoral Science Foundation
Keywords
deep reinforcement learning; multi-agent reinforcement learning; proximal policy optimization; rendezvous method; swarm system; unmanned surface vehicles;
DOI
10.13700/j.bh.1001-5965.2022.0088
Abstract
To rendezvous an indeterminate number of homogeneous unmanned surface vehicles (USVs) into desired formations, a distributed rendezvous control method based on multi-agent reinforcement learning (MARL) is proposed. A dynamic interaction graph of the swarm is built to reflect the communication and perception constraints of the USVs. A two-dimensional grid encoding method gives each agent an observation space of consistent dimension. Within the multi-agent proximal policy optimization (MAPPO) framework, which uses centralized training and distributed execution, the state and action spaces of the policy and value networks are designed separately, and a reward function is defined. In a simulated USV swarm rendezvous environment, the method converges effectively after training. Across different desired formations, different swarm sizes, and partial agent failures, the proposed method still completes the rendezvous quickly, which indicates its flexibility and robustness. © 2023 Beijing University of Aeronautics and Astronautics (BUAA). All rights reserved.
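The abstract does not spell out the grid encoding, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each agent sees neighbors within a fixed communication range (the edges of the interaction graph at the current time step) and that each grid cell stores a simple neighbor count. The names `neighbors_in_range`, `grid_observation`, `comm_range`, and `grid_size` are hypothetical.

```python
import math
from typing import List, Tuple

Position = Tuple[float, float]


def neighbors_in_range(positions: List[Position], i: int, comm_range: float) -> List[int]:
    """Indices of agents within comm_range of agent i.

    Assumption: a distance threshold stands in for the communication and
    perception constraints; recomputing it every step makes the graph dynamic.
    """
    xi, yi = positions[i]
    return [
        j
        for j, (xj, yj) in enumerate(positions)
        if j != i and math.hypot(xj - xi, yj - yi) <= comm_range
    ]


def grid_observation(
    positions: List[Position], i: int, comm_range: float, grid_size: int
) -> List[int]:
    """Flattened grid_size x grid_size neighbor counts centered on agent i.

    Assumption: cells hold occupancy counts; the paper may encode other
    features. The output length depends only on grid_size, not on swarm size.
    """
    xi, yi = positions[i]
    cell = 2.0 * comm_range / grid_size  # side length of one grid cell
    grid = [0] * (grid_size * grid_size)
    for j in neighbors_in_range(positions, i, comm_range):
        dx = positions[j][0] - xi
        dy = positions[j][1] - yi
        # Shift relative offsets into [0, 2 * comm_range] and clamp the far edge.
        col = min(int((dx + comm_range) / cell), grid_size - 1)
        row = min(int((dy + comm_range) / cell), grid_size - 1)
        grid[row * grid_size + col] += 1
    return grid
```

With `grid_size = 4`, every agent receives a 16-element observation whether the swarm has four vehicles or forty, which is what allows one policy network to serve swarms of different sizes in this sketch.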
Pages: 3365-3376
Page count: 11