Defensive Escort Teams for Navigation in Crowds via Multi-Agent Deep Reinforcement Learning

Cited by: 9
Authors
Hasan, Yazied A. [1 ]
Garg, Arpit [1 ]
Sugaya, Satomi [1 ]
Tapia, Lydia [1 ]
Affiliations
[1] Univ New Mexico, Dept Comp Sci, MSC01 11301, Albuquerque, NM 87131 USA
Funding
U.S. National Science Foundation
Keywords
Intelligent systems; machine learning; motion planning; multi-robot systems; DIFFERENTIAL GAME; CONVOY PROTECTION;
DOI
10.1109/LRA.2020.3010203
CLC Number
TP24 [Robotics]
Subject Classification Codes
080202; 1405
Abstract
Coordinated defensive escorts can aid a navigating payload by positioning themselves strategically in order to maintain the safety of the payload from obstacles. In this letter, we present a novel, end-to-end solution for coordinating an escort team to protect high-value payloads in a space crowded with interacting obstacles. Our solution employs deep reinforcement learning to train a team of escorts to maintain payload safety while navigating alongside the payload. The escorts execute a trained centralized policy in a distributed fashion (i.e., with no explicit communication between escorts), relying only on range-limited positional information about the environment. Given this observation, the escorts automatically prioritize which obstacles to intercept and determine where to intercept them, using their repulsive interaction force to actively manipulate the environment. When compared to a payload navigating with a state-of-the-art obstacle-avoidance algorithm, our defensive escort team increased navigation success by up to 83% over escorts in a static formation, up to 69% over orbiting escorts, and up to 66% over an analytic method that provides guarantees in crowded environments. We also show that the learned solution is robust to several adaptations of the scenario, including a changing number of escorts in the team, changing obstacle density, unexpected obstacle behavior, changes in payload conformation, and added sensor noise.
Pages: 5645-5652
Page count: 8