A Multi-agent Deep Reinforcement Learning Method for UAVs Cooperative Pursuit Problem

Citations: 0
Authors
Yang, Feng [1, 2]
Shao, Changshun [1, 2]
Shen, Baoyin [1, 3]
Li, Zhi [1, 2]
Affiliations
[1] Northwestern Polytech Univ, Sch Automat, Xian 710129, Peoples R China
[2] Minist Educ, Key Lab Informat Fus Technol, Xian 710129, Peoples R China
[3] China Elect Technology Grp Corp, Res Inst 14, Nanjing 210013, Peoples R China
Source
ADVANCES IN GUIDANCE, NAVIGATION AND CONTROL | 2023 / Vol. 845
Keywords
UAV; Multi-agent system; Deep reinforcement learning
DOI
10.1007/978-981-19-6613-2_699
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
UAV swarms are emerging as an important form of intelligent warfare. This paper designs a solution for UAV cooperative pursuit scenarios based on MADDPG. A clipped double Q-network and a delayed policy update mechanism are introduced to address value-function overestimation and error propagation in the MADDPG algorithm. Because MADDPG follows centralized training with distributed execution and constructs an evaluation function for each agent, the proposed method scales well and can be applied effectively to cooperative pursuit tasks.
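The two mechanisms named in the abstract can be illustrated with a minimal sketch. It assumes they take the form used in the TD3 algorithm: the TD target uses the smaller of two target-critic estimates, and the actor is updated only once every few critic updates. The function names, the `policy_delay` value, and the discount factor are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """TD target built from the smaller of two target-critic estimates.

    Taking the element-wise minimum counteracts the overestimation
    that a single critic tends to accumulate. (Assumed TD3-style form.)
    """
    reward = np.asarray(reward, dtype=float)
    done = np.asarray(done, dtype=float)
    q_min = np.minimum(np.asarray(q1_next, dtype=float),
                       np.asarray(q2_next, dtype=float))
    # Terminal transitions (done == 1) do not bootstrap from the next state.
    return reward + gamma * (1.0 - done) * q_min


def should_update_policy(step, policy_delay=2):
    """Delayed policy update: refresh the actor every `policy_delay` critic steps."""
    return step % policy_delay == 0


if __name__ == "__main__":
    # Hypothetical batch of two transitions for a single agent.
    targets = clipped_double_q_target(
        reward=[1.0, 0.0], done=[0, 1],
        q1_next=[2.0, 5.0], q2_next=[3.0, 4.0], gamma=0.5,
    )
    print(targets)
    print([should_update_policy(t) for t in range(1, 5)])
```

In a MADDPG-style setting, each agent would hold its own pair of centralized critics and apply this target per agent; that wiring is omitted here.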
Pages: 7243-7252
Page count: 10