Decentralized Multi-Agent Pursuit Using Deep Reinforcement Learning

Cited by: 103

Authors:

de Souza, Cristino, Jr. [1, 2]

Newbury, Rhys [3]

Cosgun, Akansel [3]

Castillo, Pedro [1]

Vidolov, Boris [1]

Kulic, Dana [3]

Affiliations:

[1] Univ Technol Compiegne, CNRS, Heudiasyc, 60319 CS, Compiegne, France

[2] Technol Innovat Inst, Abu Dhabi, U Arab Emirates

[3] Monash Univ, Dept Elect & Comp Syst Engn, Clayton, Vic 3800, Australia

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2021 / Vol. 6 / Issue 03

Keywords:

Reinforcement learning; Games; Drones; Kinematics; Task analysis; Trajectory; Training; Multi-robot systems; reinforcement learning; cooperating robots; SYSTEM;

DOI:

10.1109/LRA.2021.3068952

Chinese Library Classification (CLC) number:

TP24 [Robotics];

Discipline classification code:

080202 ; 1405 ;

Abstract:

Pursuit-evasion is the problem of capturing mobile targets with one or more pursuers. We use deep reinforcement learning for pursuing an omnidirectional target with multiple, homogeneous agents that are subject to unicycle kinematic constraints. We use shared experience to train a policy for a given number of pursuers, executed independently by each agent at run-time. The training uses curriculum learning, a sweeping-angle ordering to locally represent neighboring agents, and a reward structure that encourages a good formation and combines individual and group rewards. Simulated experiments with a reactive evader and up to eight pursuers show that our learning-based approach outperforms recent reinforcement learning techniques as well as non-holonomic adaptations of classical algorithms. The learned policy is successfully transferred to the real world in a proof-of-concept demonstration with three motion-constrained pursuer drones.
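
As an illustration of the "unicycle kinematic constraints" named in the abstract, the sketch below integrates the standard unicycle model (forward speed `v`, turn rate `omega`; the agent cannot move sideways) with one explicit Euler step. This is a generic textbook model, not code from the paper; the function name `unicycle_step`, its parameters, and the choice of Euler integration are assumptions made for illustration only.

```python
import math


def unicycle_step(x, y, theta, v, omega, dt):
    """Advance a unicycle-model state by one explicit Euler step.

    Hypothetical helper for illustration; not taken from the paper.
    x, y   : position
    theta  : heading in radians
    v      : forward speed (motion is only possible along the heading)
    omega  : turn rate in radians per second
    dt     : time step in seconds
    """
    x_next = x + v * math.cos(theta) * dt
    y_next = y + v * math.sin(theta) * dt
    # Wrap the new heading into [-pi, pi) so angles stay comparable.
    theta_next = (theta + omega * dt + math.pi) % (2 * math.pi) - math.pi
    return x_next, y_next, theta_next
```

An omnidirectional target, by contrast, could change its velocity direction freely at every step, which is the asymmetry between evader and pursuers that the abstract describes.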


Pages: 4552-4559

Page count: 8
