Defensive Escort Teams for Navigation in Crowds via Multi-Agent Deep Reinforcement Learning

Cited by: 9
Authors
Hasan, Yazied A. [1 ]
Garg, Arpit [1 ]
Sugaya, Satomi [1 ]
Tapia, Lydia [1 ]
Affiliations
[1] Univ New Mexico, Dept Comp Sci, MSC01 11301, Albuquerque, NM 87131 USA
Funding
U.S. National Science Foundation
Keywords
Intelligent systems; machine learning; motion planning; multi-robot systems; DIFFERENTIAL GAME; CONVOY PROTECTION;
DOI
10.1109/LRA.2020.3010203
CLC Number
TP24 [Robotics]
Subject Classification Codes
080202; 1405
Abstract
Coordinated defensive escorts can aid a navigating payload by positioning themselves strategically in order to maintain the safety of the payload from obstacles. In this letter, we present a novel, end-to-end solution for coordinating an escort team to protect high-value payloads in a space crowded with interacting obstacles. Our solution employs deep reinforcement learning to train a team of escorts to maintain payload safety while navigating alongside the payload. The escorts execute a trained centralized policy in a distributed fashion (i.e., with no explicit communication between escorts), relying only on range-limited positional information about the environment. Given this observation, the escorts automatically prioritize which obstacles to intercept and determine where to intercept them, using their repulsive interaction force to actively manipulate the environment. When compared to a payload navigating with a state-of-the-art obstacle-avoidance algorithm, our defensive escort team increased navigation success by up to 83% over escorts in a static formation, up to 69% over orbiting escorts, and up to 66% over an analytic method that provides guarantees in crowded environments. We also show that the learned solution is robust to several adaptations of the scenario, including a changing number of escorts in the team, changing obstacle density, unexpected obstacle behavior, changes in payload conformation, and added sensor noise.
Pages: 5645-5652
Page count: 8