Cooperative control for multi-player pursuit-evasion games with reinforcement learning

Cited by: 94
Authors
Wang, Yuanda [1 ,3 ]
Dong, Lu [2 ]
Sun, Changyin [1 ,3 ]
Affiliations
[1] Southeast Univ, Sch Automat, Nanjing 210096, Peoples R China
[2] Tongji Univ, Coll Elect & Informat Engn, Shanghai 201804, Peoples R China
[3] Southeast Univ, Minist Educ, Key Lab Measurement & Control Complex Syst Engn, Nanjing 210096, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Pursuit-evasion game; Reinforcement learning; Distributed control; Communication network; ALGORITHM; SYSTEMS; GO;
DOI
10.1016/j.neucom.2020.06.031
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we consider a pursuit-evasion game in which multiple pursuers attempt to capture one superior evader. A distributed cooperative pursuit strategy with communication is developed based on reinforcement learning. A centralized-critic, distributed-actor structure and a learning-based communication mechanism are adopted to solve the cooperative pursuit control problem. Instead of broadcasting information among the pursuers, we construct a ring topology network and a leader-follower line topology network for communication, which significantly reduces complexity and saves communication and computation resources. The training algorithms for these two network topologies are developed based on the deep deterministic policy gradient algorithm. Furthermore, the proposed approach is implemented in a simulation environment. The training and evaluation results demonstrate that the pursuit team learns highly efficient cooperative control and communication policies, and that the pursuers capture a superior evader driven by an intelligent escape policy with a high success rate. (c) 2020 Elsevier B.V. All rights reserved.
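The ring-topology communication scheme described in the abstract can be sketched as follows. This is an illustrative toy, not the authors' implementation: the `encode_message` and `actor` functions below are simple placeholders standing in for the learned DDPG message encoders and distributed actor networks. The sketch shows the structural point of the abstract: each pursuer exchanges a fixed-size message with only its ring neighbor, so per-step communication uses n directed links rather than the n(n-1) links of all-to-all broadcast.

```python
# Illustrative ring-topology message passing among n pursuers
# (toy placeholders for the learned DDPG actor/encoder networks).

N_PURSUERS = 4
MSG_DIM = 2

def encode_message(obs, msg_in):
    # Placeholder message encoder: compress the local observation and
    # the incoming message into one fixed-size MSG_DIM vector.
    return [sum(obs) / len(obs), sum(msg_in) / len(msg_in)]

def actor(obs, msg_in):
    # Placeholder distributed actor: map the local observation plus the
    # received message to a control action.
    return [o + m for o, m in zip(obs[:MSG_DIM], msg_in)]

def ring_step(observations):
    """One decision step over the ring: pursuer i acts on its own
    observation and the message encoded by pursuer (i - 1) mod n."""
    n = len(observations)
    msgs = [[0.0] * MSG_DIM for _ in range(n)]  # initial zero messages
    # Pass 1: every pursuer encodes a message for its ring successor.
    msgs = [encode_message(observations[i], msgs[i - 1]) for i in range(n)]
    # Pass 2: every actor acts on local obs + predecessor's message.
    actions = [actor(observations[i], msgs[i - 1]) for i in range(n)]
    return msgs, actions

obs = [[float(i), float(i) + 1.0] for i in range(N_PURSUERS)]
msgs, actions = ring_step(obs)

# Communication cost comparison: ring vs. all-to-all broadcast.
ring_links = N_PURSUERS                       # 4 directed links
broadcast_links = N_PURSUERS * (N_PURSUERS - 1)  # 12 directed links
```

Under this structure the message dimension stays constant as the team grows, which is consistent with the abstract's claim that the ring and line topologies reduce communication and computation compared with broadcast.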
Pages: 101-114
Page count: 14
References
46 items
[1] Abadi M, 2016, ACM SIGPLAN Notices, V51, P1, DOI 10.1145/3022670.2976746, 10.1145/2951913.2976746
[2] [Anonymous], arXiv:1705.08926
[3] [Anonymous], 2016, Thesis
[4] [Anonymous], arXiv:1703.04908
[5] Awheda M.D., 2016, P 2016 ANN IEEE SYST, P1
[6] Bilgin AT, 2015, Proceedings of the 17th International Conference on Advanced Robotics (ICAR), P164, DOI 10.1109/ICAR.2015.7251450
[7] Breakwell JV, Hagedorn P. Point capture of 2 evaders in succession [J]. Journal of Optimization Theory and Applications, 1979, 27(01): 89-97
[8] Busoniu L, Babuska R, De Schutter B. A comprehensive survey of multiagent reinforcement learning [J]. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 2008, 38(02): 156-172
[9] Cheng Y, Zhang W. Concise deep reinforcement learning obstacle avoidance for underactuated unmanned marine vessels [J]. Neurocomputing, 2018, 272: 63-73
[10] Cui L, Wang X, Zhang Y. Reinforcement learning-based asymptotic cooperative tracking of a class multi-agent dynamic systems using neural networks [J]. Neurocomputing, 2016, 171: 220-229