Symphony: Learning Realistic and Diverse Agents for Autonomous Driving Simulation

Cited by: 16
Authors
Igl, Maximilian [1]
Kim, Daewoo [1]
Kuefler, Alex [1]
Mougin, Paul [1]
Shah, Punit [1]
Shiarlis, Kyriacos [1]
Anguelov, Dragomir [1]
Palatucci, Mark [1]
White, Brandyn [1]
Whiteson, Shimon [1]
Affiliations
[1] Waymo Res, Mountain View, CA 94043 USA
Source
2022 IEEE International Conference on Robotics and Automation (ICRA 2022) | 2022
Keywords
GO; SHOGI; CHESS; GAME
DOI
10.1109/ICRA46639.2022.9811990
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Simulation is a crucial tool for accelerating the development of autonomous vehicles. Making simulation realistic requires models of the human road users who interact with such cars. Such models can be obtained by applying learning from demonstration (LfD) to trajectories observed by cars already on the road. However, existing LfD methods are typically insufficient, yielding policies that frequently collide or drive off the road. To address this problem, we propose Symphony, which greatly improves realism by combining conventional policies with a parallel beam search. The beam search refines these policies on the fly by pruning branches that are unfavourably evaluated by a discriminator. However, it can also harm diversity, i.e., how well the agents cover the entire distribution of realistic behaviour, as pruning can encourage mode collapse. Symphony addresses this issue with a hierarchical approach, factoring agent behaviour into goal generation and goal conditioning. The use of such goals ensures that agent diversity neither disappears during adversarial training nor is pruned away by the beam search. Experiments on both proprietary and open Waymo datasets confirm that Symphony agents learn more realistic and diverse behaviour than several baselines.
Pages: 2445-2451
Page count: 7