Hierarchical framework integrating rapidly-exploring random tree with deep reinforcement learning for autonomous vehicle

Cited by: 17
Authors
Yu, Jiaxing [1, 2]
Arab, Aliasghar [2 ]
Yi, Jingang [2 ]
Pei, Xiaofei [1 ]
Guo, Xuexun [1 ]
Affiliations
[1] Wuhan Univ Technol, Hubei Key Lab Adv Technol Automot Components, Luogui Rd, Wuhan 430070, Hubei, Peoples R China
[2] Rutgers State Univ, Dept Mech & Aerosp Engn, 98 Brett Rd, Piscataway, NJ 08854 USA
Funding
U.S. National Science Foundation
Keywords
Autonomous vehicle; Reinforcement learning; Rapidly-exploring random tree (RRT); Machine learning;
DOI
10.1007/s10489-022-04358-7
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a systematic driving framework in which a reinforcement learning (RL) decision-making module is integrated with a rapidly-exploring random tree (RRT) motion planner. RL generates local goals and semantic speed commands that control the longitudinal speed of the vehicle, with rewards designed for driving safety and traffic efficiency. To ensure driving comfort, RRT returns a feasible path that the vehicle follows under the speed commands. A scene decomposition approach is used to scale the deep neural network (DNN) to environments with multiple traffic participants, and double deep Q-networks (DDQN) with prioritized experience replay (PER) are used to accelerate training. To handle disturbances in the agent's perception, an ensemble of neural networks is used to evaluate the uncertainty of decisions. The results show that the proposed framework can handle unexpected actions of traffic participants at an intersection, yielding safe, comfortable, and efficient driving behavior. The ensemble of DDQN with PER is also shown to outperform standard DDQN in learning efficiency and robustness to disturbances.
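The two learning ingredients named in the abstract, the double DQN target and an ensemble-based uncertainty check, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the use of the ensemble standard deviation as the uncertainty measure, the threshold, and the fallback action are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def ddqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target for one transition.

    The online network selects the next action and the target network
    evaluates it, which reduces the overestimation of plain Q-learning.
    """
    best_action = int(np.argmax(q_online_next))
    bootstrap = 0.0 if done else gamma * float(q_target_next[best_action])
    return reward + bootstrap


def ensemble_decision(q_values, std_threshold, fallback_action):
    """Pick an action from an ensemble of Q-value estimates.

    q_values has shape (n_members, n_actions). The spread across members
    is used as an uncertainty estimate (an assumption of this sketch).
    If the chosen action is too uncertain, a hypothetical fallback action
    (for example, a cautious speed command) is returned instead.
    """
    q_values = np.asarray(q_values, dtype=float)
    mean_q = q_values.mean(axis=0)
    std_q = q_values.std(axis=0)
    action = int(np.argmax(mean_q))
    uncertainty = float(std_q[action])
    if uncertainty > std_threshold:
        return fallback_action, uncertainty
    return action, uncertainty
```

In a hierarchical setup like the one described, the selected action would correspond to a local goal and a semantic speed command that is then handed to the motion planner; that mapping is not shown here.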
Pages: 16473-16486
Number of pages: 14