Decentralized Control of Quadrotor Swarms with End-to-end Deep Reinforcement Learning

Cited by: 0
Authors
Batra, Sumeet [1]
Huang, Zhehui [1]
Petrenko, Aleksei [1]
Kumar, Tushar [1]
Molchanov, Artem [1]
Sukhatme, Gaurav S. [1]
Affiliations
[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
Source
CONFERENCE ON ROBOT LEARNING, VOL 164 | 2021 / Vol. 164
Keywords
Swarms; Multi-robot systems; Multi-robot learning; Trajectory generation
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
We demonstrate the possibility of learning drone swarm controllers that are zero-shot transferable to real quadrotors via large-scale multi-agent end-to-end reinforcement learning. We train policies parameterized by neural networks that are capable of controlling individual drones in a swarm in a fully decentralized manner. Our policies, trained in simulated environments with realistic quadrotor physics, demonstrate advanced flocking behaviors, perform aggressive maneuvers in tight formations while avoiding collisions with each other, break and re-establish formations to avoid collisions with moving obstacles, and efficiently coordinate in pursuit-evasion tasks. We analyze, in simulation, how different model architectures and parameters of the training regime influence the final performance of neural swarms. We demonstrate the successful deployment of the model learned in simulation to highly resource-constrained physical quadrotors performing station keeping and goal swapping behaviors. Video demonstrations and source code are available at the project website https://sites.google.com/view/swarm-rl.
Pages: 576-586
Page count: 11