Collaborative Study of Decision-making and Trajectory Planning for Autonomous Driving Based on Soft Actor-Critic Algorithm

Cited by: 0
Authors: Tang B. [1], Liu G. [1], Jiang H. [1], Tian N. [1], Mi W. [1], Wang C. [2]
Affiliations:
[1] Automotive Engineering Research Institute, Jiangsu University, Zhenjiang, Jiangsu
[2] Jiangsu Gangyang Steering System Co., Ltd., Taizhou, Jiangsu
Funding: National Natural Science Foundation of China
Keywords: autonomous driving; collaborative decision and planning; deep reinforcement learning; intelligent transportation; soft actor-critic algorithm
DOI: 10.16097/j.cnki.1009-6744.2024.02.011
Abstract
To improve the learning speed, safety, and rationality of autonomous driving decision-making, this paper proposed a collaborative decision-making and planning method for autonomous driving based on the Soft Actor-Critic (SAC) algorithm. A collaborative decision-planning agent was designed by combining the SAC algorithm with a rule-based decision-planning method. A preprocessing network combining a Self-Attention Mechanism (SAM) and a Gated Recurrent Unit (GRU) was constructed to improve the agent's understanding of traffic scenarios and to accelerate its learning. Taking the specific implementation of the planning module into account, the action space was designed to improve the executability of the decision results. The reward function was designed using feedback from the planning module: constraints on vehicle driving conditions were imposed on the agent, and trajectory information was passed back to the decision-making module, so that the collaboration between decision-making and planning improved the safety and rationality of decisions. Dynamic traffic scenarios were built on the CARLA autonomous driving simulation platform to train the agent, and the proposed collaborative method was compared with a conventional SAC-based decision-planning method in different scenarios. The experimental results showed that the learning speed of the proposed agent increased by 25.10%, the average vehicle speed produced by its decisions was higher and closer to the expected road speed, and the speed variation rate, path length, and curvature variation rate resulting from its decisions were all smaller than those of the conventional method. © 2024 Science Press. All rights reserved.
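The abstract's description of a preprocessing network that fuses self-attention over surrounding traffic with a GRU over the ego state history can be illustrated with a minimal sketch. The module below is not the authors' implementation; the class name, layer sizes, feature layouts, and the mean-pooling choice are assumptions made purely for illustration of the SAM + GRU idea feeding a SAC actor/critic.

```python
# Illustrative sketch (not the paper's code): a scene-preprocessing network that
# combines multi-head self-attention over surrounding-vehicle features with a
# GRU over the ego vehicle's recent state history. All dimensions are assumed.
import torch
import torch.nn as nn

class ScenePreprocessor(nn.Module):
    def __init__(self, veh_feat_dim=6, ego_feat_dim=8, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.veh_embed = nn.Linear(veh_feat_dim, embed_dim)
        # Self-attention over the set of surrounding vehicles (spatial context)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        # GRU over the ego state history (temporal context)
        self.gru = nn.GRU(ego_feat_dim, hidden_dim, batch_first=True)
        self.fuse = nn.Linear(embed_dim + hidden_dim, hidden_dim)

    def forward(self, veh_feats, ego_history):
        # veh_feats:   (batch, n_vehicles, veh_feat_dim)
        # ego_history: (batch, n_steps, ego_feat_dim)
        v = self.veh_embed(veh_feats)
        attn_out, _ = self.attn(v, v, v)        # (batch, n_vehicles, embed_dim)
        scene = attn_out.mean(dim=1)            # pool over surrounding vehicles
        _, h = self.gru(ego_history)            # h: (1, batch, hidden_dim)
        fused = torch.cat([scene, h.squeeze(0)], dim=-1)
        return torch.relu(self.fuse(fused))     # state encoding for the SAC actor/critic

# Usage with random tensors: 4 scenes, 5 neighbors, 10-step ego history
enc = ScenePreprocessor()
state = enc(torch.randn(4, 5, 6), torch.randn(4, 10, 8))  # -> (4, 128)
```

In a setup like this, the pooled attention output summarizes interactions with nearby vehicles while the GRU hidden state carries the ego vehicle's recent motion, which is one plausible way such an encoding could help the agent interpret dynamic traffic scenarios faster, as the abstract claims.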
Pages: 105-113
Number of pages: 8