Multi-robot Target Encirclement Control with Collision Avoidance via Deep Reinforcement Learning

Cited by: 47
Authors
Ma, Junchong [1]
Lu, Huimin [1]
Xiao, Junhao [1]
Zeng, Zhiwen [1]
Zheng, Zhiqiang [1]
Affiliations
[1] Natl Univ Def Technol, Coll Intelligence Sci & Technol, Changsha 410073, Hunan, Peoples R China
Keywords
Multi-robot; Deep reinforcement learning; Encirclement control; Collision avoidance; ANONYMOUS MOBILE AGENTS; COOPERATIVE CONTROL; NETWORKS; SYSTEMS; GAME; GO;
DOI
10.1007/s10846-019-01106-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper investigates target encirclement control of multi-robot systems via deep reinforcement learning. Inspired by the way dolphins encircle fish to entrap them, encirclement control drives the robots into a capturing formation around a target and applies to tasks such as coverage, patrolling, and escorting. Unlike traditional methods, we propose a deep reinforcement learning framework for multi-robot target encirclement formation control that combines the advantages of deep neural networks and the deterministic policy gradient algorithm, which avoids the complicated work of building a control model and designing a control law. The method provides a distributed control architecture in which each robot acts in a continuous action space and relies only on local teammate information. The behavioral output of each robot at each time step is determined by its own independent network. In addition, the robots and the moving target can be trained simultaneously, so the framework captures both cooperation and competition. The results validate the effectiveness of the proposed algorithm.
Pages: 371-386
Number of pages: 16