Differentiable Learning of Scalable Multi-Agent Navigation Policies

Cited by: 4

Authors:

Ye, Xiaohan [1,2]

Pan, Zherong [1]

Gao, Xifeng

Wu, Kui

Ren, Bo [2]

Affiliations:

[1] Tencent, LightSpeed Studios, Shenzhen 518054, Peoples R China

[2] Nankai Univ, Coll Comp Sci, Tianjin 300350, Peoples R China

Source:

IEEE ROBOTICS AND AUTOMATION LETTERS | 2023, Vol. 8, No. 4

Keywords:

Navigation; Task analysis; Heuristic algorithms; Trajectory; Training; Kernel; Mathematical models; Multi-robot systems; robotics and automation; swarm robotics;

DOI:

10.1109/LRA.2023.3248440

Chinese Library Classification (CLC) number:

TP24 [Robot technology];

Discipline classification code:

080202 ; 1405 ;

Abstract:

We present an end-to-end differentiable learning algorithm for multi-agent navigation policies. Compared with prior model-free learning algorithms, our method leads to a significant speedup via the gradient information. Our key innovation lies in a novel differentiability analysis of the optimization-based crowd simulation algorithm via the implicit function theorem. Inspired by continuum multi-agent modeling techniques, we further propose a kernel-based policy parameterization, allowing our learned policy to scale up to an arbitrary number of agents without re-training. We evaluate our algorithm on two tasks in obstacle-rich environments, partially labeled navigation and evacuation, for which loss functions can be defined, making the entire task learnable in an end-to-end manner. The results show that our method can achieve more than one order of magnitude speedup over model-free baselines and readily scale to unseen target configurations and agent sizes.
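
The abstract's central idea, differentiating an optimization-based simulation step via the implicit function theorem, can be illustrated on a toy problem. The sketch below is not the authors' crowd simulator: the scalar energy, the names `solve_step`, `ift_derivative`, and `fd_derivative`, and the test value of `theta` are all assumptions chosen for illustration. It solves x*(theta) = argmin_x E(x, theta) with Newton's method, then obtains dx*/dtheta from the optimality condition dE/dx = 0 rather than by differentiating through the solver iterations.

```python
# Toy energy (an assumption for illustration, not the paper's model):
#   E(x, theta) = 0.5 * (x - theta)**2 + 0.25 * x**4
# Optimality condition: g(x, theta) = dE/dx = x - theta + x**3 = 0

def grad_energy(x, theta):
    """First derivative of E with respect to x."""
    return x - theta + x ** 3

def hess_energy(x):
    """Second derivative of E with respect to x (always >= 1, so E is convex)."""
    return 1.0 + 3.0 * x ** 2

def solve_step(theta, iters=50):
    """Newton's method for x*(theta) = argmin_x E(x, theta)."""
    x = 0.0
    for _ in range(iters):
        x -= grad_energy(x, theta) / hess_energy(x)
    return x

def ift_derivative(theta):
    """Implicit function theorem: dx*/dtheta = -(dg/dx)^(-1) * dg/dtheta.

    Here dg/dtheta = -1, so dx*/dtheta = 1 / hess_energy(x*).
    """
    x_star = solve_step(theta)
    return 1.0 / hess_energy(x_star)

def fd_derivative(theta, eps=1e-6):
    """Central finite difference through the solver, used only as a check."""
    return (solve_step(theta + eps) - solve_step(theta - eps)) / (2.0 * eps)

theta = 2.0                      # hypothetical parameter value
dx_ift = ift_derivative(theta)   # analytic value is 1 / (1 + 3 * 1**2) = 0.25
dx_fd = fd_derivative(theta)
print(dx_ift, dx_fd)
```

For theta = 2 the minimizer is x* = 1, so the implicit derivative is 1/4, and the finite-difference check should agree to several digits. In a multi-agent setting the scalar division would become a linear solve against the Hessian of the simulation energy, which is the general shape of this technique rather than a claim about the paper's specific formulation.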

Citation

Pages: 2229-2236

Page count: 8

