Differentiable Learning of Scalable Multi-Agent Navigation Policies

Cited by: 4
Authors
Ye, Xiaohan [1, 2]
Pan, Zherong [1]
Gao, Xifeng
Wu, Kui
Ren, Bo [2]
Affiliations
[1] Tencent, LightSpeed Studios, Shenzhen 518054, Peoples R China
[2] Nankai Univ, Coll Comp Sci, Tianjin 300350, Peoples R China
Keywords
Navigation; Task analysis; Heuristic algorithms; Trajectory; Training; Kernel; Mathematical models; Multi-robot systems; Robotics and automation; Swarm robotics
DOI
10.1109/LRA.2023.3248440
Chinese Library Classification number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
We present an end-to-end differentiable learning algorithm for multi-agent navigation policies. Compared with prior model-free learning algorithms, our method leads to a significant speedup via the gradient information. Our key innovation lies in a novel differentiability analysis of the optimization-based crowd simulation algorithm via the implicit function theorem. Inspired by continuum multi-agent modeling techniques, we further propose a kernel-based policy parameterization, allowing our learned policy to scale up to an arbitrary number of agents without re-training. We evaluate our algorithm on two tasks in obstacle-rich environments, partially labeled navigation and evacuation, for which loss functions can be defined, making the entire task learnable in an end-to-end manner. The results show that our method can achieve more than one order of magnitude speedup over model-free baselines and readily scale to unseen target configurations and agent sizes.
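The abstract's central idea, differentiating the output of an optimization step via the implicit function theorem, can be illustrated on a one-dimensional toy problem. The sketch below is not the paper's crowd simulator: the energy `E(x, theta)`, the function names, and the Newton solver are all hypothetical choices made for illustration. It only shows how the gradient of a minimizer with respect to a parameter follows from the stationarity condition, without differentiating through the solver's iterations.

```python
# Toy illustration (NOT the paper's simulator): differentiate the minimizer
# x*(theta) of an energy E(x, theta) using the implicit function theorem.
# Hypothetical energy: E(x, theta) = 0.5 * (x - theta)**2 + 0.25 * x**4


def stationarity(x, theta):
    """g(x, theta) = dE/dx, which vanishes at the minimizer."""
    return (x - theta) + x**3


def d_stationarity_dx(x):
    """dg/dx = d2E/dx2; positive everywhere, so the minimizer is unique."""
    return 1.0 + 3.0 * x**2


def solve_minimizer(theta, iters=50):
    """Newton's method on g(x, theta) = 0, started from x = 0."""
    x = 0.0
    for _ in range(iters):
        x -= stationarity(x, theta) / d_stationarity_dx(x)
    return x


def implicit_gradient(theta):
    """dx*/dtheta = -(dg/dx)^(-1) * dg/dtheta, where dg/dtheta = -1."""
    x_star = solve_minimizer(theta)
    return 1.0 / d_stationarity_dx(x_star)


def finite_difference_gradient(theta, eps=1e-6):
    """Central-difference check that re-solves the optimization twice."""
    return (solve_minimizer(theta + eps) - solve_minimizer(theta - eps)) / (2 * eps)
```

For `theta = 2.0` the minimizer is `x* = 1`, so the implicit gradient is `1 / (1 + 3) = 0.25`, which the finite-difference check reproduces to within numerical error. In a learning setting, a gradient of this kind is what lets a loss on the simulated state be backpropagated to the policy parameters.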
Pages: 2229-2236
Number of pages: 8