Safe and Interpretable Human-Like Planning With Transformer-Based Deep Inverse Reinforcement Learning for Autonomous Driving

Cited: 0
Authors
Nan, Jiangfeng [1 ]
Zhang, Ruzheng [2 ]
Yin, Guodong [1 ]
Zhuang, Weichao [1 ]
Zhang, Yilong [3 ]
Deng, Weiwen [4 ]
Affiliations
[1] Southeast University, School of Mechanical Engineering, Nanjing
[2] Horizon Robotics, Beijing
[3] Beihang University, School of Computer Science and Engineering, Beijing
[4] Beihang University, School of Transportation Science and Engineering, Beijing
Funding
National Natural Science Foundation of China
Keywords
Autonomous driving; decision-making; deep inverse reinforcement learning; interpretability; planning;
DOI
10.1109/TASE.2025.3539340
Abstract
Human-like decision-making and planning are crucial for advancing the decision-making capability of autonomous driving, for increasing acceptance in the autonomous vehicle market, and for closing the data loop in autonomous driving. However, human-like decision-making and planning methods still face challenges in safety and interpretability, particularly in multi-vehicle interaction scenarios. To address this, this paper proposes an interpretable human-like decision-making and planning method based on Transformer-based deep inverse reinforcement learning. The proposed method employs a Transformer encoder to extract features from the scenario and to determine the attention the ego vehicle assigns to each traffic vehicle, thereby improving the interpretability of planning outcomes. Furthermore, to improve planning safety, the model is trained on both positive and negative expert demonstrations. Experimental results show that the proposed method enhances safety while maintaining imitation performance compared with conventional methods. Additionally, the attention allocation closely aligns with that of human drivers, indicating the model's ability to explain the importance of each traffic vehicle for decision-making and planning. The proposed method therefore not only achieves high levels of imitation and safety but also enhances interpretability by providing accurate attention allocation for decision-making and planning. Note to Practitioners - This paper presents a method that makes the planning of autonomous vehicles more interpretable and safer. Using Transformer-based deep inverse reinforcement learning, the approach improves clarity by showing how the vehicle prioritizes other traffic participants, and it learns from both positive and negative examples.
This not only improves safety and decision accuracy but also provides insight into the vehicle's reasoning process, which is crucial for debugging and for increasing user trust. Future work could adapt this method to even more complex driving scenarios. © 2004-2012 IEEE.
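The abstract's interpretability claim rests on reading out the attention a Transformer encoder assigns from the ego vehicle to each traffic vehicle. The following is a minimal illustrative sketch of that idea only, not the paper's implementation: a single-head scaled dot-product attention in NumPy, where the ego feature vector queries hypothetical per-vehicle feature vectors and the resulting softmax weights give one interpretable importance score per traffic vehicle. All feature definitions and function names here are assumptions for illustration.

```python
import numpy as np

def ego_attention(ego_feat, traffic_feats):
    """Single-head scaled dot-product attention: the ego vehicle's feature
    vector acts as the query; each traffic vehicle's feature vector acts as
    both key and value. Returns the attended scene feature and one
    attention weight per traffic vehicle."""
    d = ego_feat.shape[-1]
    scores = traffic_feats @ ego_feat / np.sqrt(d)   # (n_vehicles,)
    scores = scores - scores.max()                   # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over vehicles
    context = weights @ traffic_feats                # weighted scene feature
    return context, weights

# Hypothetical per-vehicle features: [rel_x, rel_y, rel_vx, rel_vy]
ego = np.array([0.0, 0.0, 1.0, 0.0])
traffic = np.array([
    [ 5.0,  0.0, -0.5, 0.0],   # lead vehicle, same lane
    [-8.0,  3.5,  0.8, 0.0],   # adjacent-lane vehicle, behind
    [30.0, -3.5,  0.0, 0.0],   # distant vehicle
])
context, attn = ego_attention(ego, traffic)
print(attn)  # one weight per traffic vehicle; the weights sum to 1
```

In the paper's setting these weights are a byproduct of the trained encoder; inspecting them per planning step is what lets practitioners see which traffic vehicle the model treated as most important.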
Pages: 12134-12146
Page count: 12
Related Papers
56 records
[1] Hang P., Zhang Y., Lv C., Brain-inspired modeling and decision-making for human-like autonomous driving in mixed traffic environment, IEEE Trans. Intell. Transp. Syst., 24, 10, pp. 10420-10432, (2023)
[2] Zhang Z., Tian R., Sherony R., Domeyer J., Ding Z., Attention-based interrelation modeling for explainable automated driving, IEEE Trans. Intell. Vehicles, 8, 2, pp. 1564-1573, (2023)
[3] Le Mero L., Yi D., Dianati M., Mouzakitis A., A survey on imitation learning techniques for end-to-end autonomous vehicles, IEEE Trans. Intell. Transp. Syst., 23, 9, pp. 14128-14147, (2022)
[4] Pomerleau D., ALVINN: An autonomous land vehicle in a neural network, Proc. Adv. Neural Inf. Process. Syst., 1, pp. 305-313, (1988)
[5] Bojarski M., et al., End to end learning for self-driving cars, (2016)
[6] Codevilla F., Muller M., Lopez A., Koltun V., Dosovitskiy A., End-to-end driving via conditional imitation learning, Proc. IEEE Int. Conf. Robot. Autom. (ICRA), pp. 4693-4700, (2018)
[7] Zhao C., Sun L., Yan Z., Neumann G., Duckett T., Stolkin R., Learning Kalman network: A deep monocular visual odometry for on-road driving, Robot. Auto. Syst., 121, (2019)
[8] Grigorescu S.M., Trasnea B., Marina L., Vasilcoi A., Cocias T., NeuroTrajectory: A neuroevolutionary approach to local state trajectory learning for autonomous vehicles, IEEE Robot. Autom. Lett., 4, 4, pp. 3441-3448, (2019)
[9] Leordeanu M., Paraicu I., Driven by vision: Learning navigation by visual localization and trajectory prediction, Sensors, 21, 3, (2021)
[10] Arora S., Doshi P., A survey of inverse reinforcement learning: Challenges, methods and progress, Artif. Intell., 297, (2021)