Safe reinforcement learning with mixture density network, with application to autonomous driving

Cited by: 13
Authors
Baheri, Ali [1]
Affiliation
[1] West Virginia Univ, Dept Aerosp & Mech Engn, Morgantown, WV 26505 USA
Source
RESULTS IN CONTROL AND OPTIMIZATION | 2022 / Volume 6
Keywords
Safe reinforcement learning; Multimodal trajectory prediction; Mixture density network; Autonomous highway driving
DOI
10.1016/j.rico.2022.100095
Chinese Library Classification (CLC) number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
This paper presents a safe reinforcement learning system for automated driving that benefits from multimodal future trajectory predictions. We propose a safety system that consists of two components: a rule-based module and a multimodal learning-based module. The rule-based module encodes common driving rules. The multimodal learning-based module, in contrast, is a data-driven safety rule that learns safety patterns from historical driving data. Specifically, it uses mixture density recurrent neural networks (MD-RNN) for multimodal future trajectory prediction to mimic the potential behaviors of an autonomous agent and consequently accelerate the learning process. Our simulation results demonstrate that the proposed safety system outperforms previously reported results in terms of average reward and collision frequency.
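As a rough illustration of the mixture density idea named in the abstract, the sketch below scores a one-dimensional target under a Gaussian mixture whose parameters would, in an MD-RNN, be emitted by the network at each time step. It is not the paper's implementation: the function names, the one-dimensional setting, and the parameterization by logits and log standard deviations are all assumptions made for illustration.

```python
import math

def log_softmax(logits):
    """Return log mixture weights from unnormalized logits (numerically stable)."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_total for z in logits]

def gaussian_log_pdf(y, mean, log_std):
    """Log density of a 1-D Gaussian parameterized by mean and log standard deviation."""
    z = (y - mean) / math.exp(log_std)
    return -0.5 * z * z - log_std - 0.5 * math.log(2.0 * math.pi)

def mdn_nll(y, logits, means, log_stds):
    """Negative log-likelihood of y under the mixture (hypothetical helper).

    Each component contributes log(weight) + log N(y | mean, std); the
    components are combined with a log-sum-exp for numerical stability.
    """
    log_weights = log_softmax(logits)
    terms = [
        lw + gaussian_log_pdf(y, mu, ls)
        for lw, mu, ls in zip(log_weights, means, log_stds)
    ]
    m = max(terms)
    return -(m + math.log(sum(math.exp(t - m) for t in terms)))

# Two equally weighted modes, e.g. "keep lane" near -2 and "change lane" near +2
# (illustrative values only, not taken from the paper).
logits = [0.0, 0.0]
means = [-2.0, 2.0]
log_stds = [math.log(0.5), math.log(0.5)]

nll_at_mode = mdn_nll(2.0, logits, means, log_stds)
nll_between_modes = mdn_nll(0.0, logits, means, log_stds)
```

A single Gaussian fitted to such data would place its mean between the two modes, where almost no trajectories lie; the mixture assigns a much lower loss to a target at either mode than to one in between, which is what makes the prediction multimodal.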
Pages: 7