Constraints Driven Safe Reinforcement Learning for Autonomous Driving Decision-Making

Cited by: 1

Authors:

Gao, Fei [1,2]

Wang, Xiaodong [1]

Fan, Yuze [1]

Gao, Zhenhai [1,2]

Zhao, Rui [1]

Affiliations:

[1] Jilin Univ, Coll Automot Engn, Changchun 130025, Peoples R China

[2] Jilin Univ, Natl Key Lab Automot Chassis Integrat & Bion, Changchun 130025, Peoples R China

Source:

IEEE ACCESS | 2024 / Vol. 12

Funding:

National Science Foundation (United States);

Keywords:

Autonomous vehicles; Safety; Road transportation; Decision making; Planning; Measurement; Accuracy; Autonomous driving; Reinforcement learning; Constrained policy optimization;

DOI:

10.1109/ACCESS.2024.3454249

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Although reinforcement learning (RL) shows promise for decision-making and planning in autonomous driving, ensuring vehicle safety under all circumstances remains a formidable challenge in practice. Current RL methods are predominantly driven by a single reward signal and often struggle to balance multiple sub-rewards such as safety, comfort, and efficiency. To address these limitations, this paper introduces a constraint-driven safe RL method and applies it to decision-making and planning policies in highway scenarios. The method maximizes performance rewards within the bounds of safety constraints and exhibits strong robustness. First, the autonomous driving decision-making problem is reformulated as a Constrained Markov Decision Process (CMDP) within the safe RL framework. Next, a Multi-Level Safety-Constrained Policy Optimization (MLSCPO) method is introduced, which incorporates a cost function to handle the safety constraints. Finally, simulated tests in the CARLA environment show that MLSCPO outperforms both an advanced safe RL policy, Proximal Policy Optimization with Lagrangian (PPO-Lag), and a traditional, stable longitudinal and lateral driving model, the Intelligent Driver Model with Minimization of Overall Braking Induced by Lane Changes (IDM+MOBIL). Compared with the classic IDM+MOBIL method, the proposed approach achieves efficient driving and also offers a better driving experience. Compared with PPO-Lag, it significantly improves safety while maintaining driving efficiency, achieving a zero-collision rate. Future work will extend the method to improve its usability and generalization in real-world applications.
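For readers unfamiliar with the constrained formulation mentioned in the abstract: a CMDP generally seeks a policy that maximizes expected discounted reward while keeping expected discounted cost below a limit. The snippet below is a minimal sketch of that idea in the Lagrangian style used by baselines such as PPO-Lag. It is not the MLSCPO algorithm, which the abstract does not specify; the function names, the cost limit, and the learning rate are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def discounted_return(values: Sequence[float], gamma: float = 0.99) -> float:
    """Sum of gamma**t * values[t] over one episode (works for rewards or costs)."""
    total = 0.0
    for v in reversed(values):
        total = v + gamma * total
    return total


def update_multiplier(lam: float, episode_cost: float, cost_limit: float,
                      lr: float = 0.05) -> float:
    """Dual ascent step on the Lagrange multiplier (illustrative, assumed form).

    The multiplier grows when the episode cost exceeds the limit and shrinks
    otherwise; it is clipped at zero so the penalty never rewards unsafe behavior.
    """
    return max(0.0, lam + lr * (episode_cost - cost_limit))


def penalized_objective(episode_reward: float, episode_cost: float,
                        lam: float) -> float:
    """Scalar objective a Lagrangian method would maximize for a fixed multiplier."""
    return episode_reward - lam * episode_cost
```

In such a scheme the policy is trained on the penalized objective while the multiplier is adjusted between updates, so a policy that repeatedly violates the cost limit faces a steadily increasing safety penalty.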


Pages: 128007-128023

Page count: 17

