Longitudinal Control of Automated Vehicles: A Novel Approach by Integrating Deep Reinforcement Learning With Intelligent Driver Model

Cited: 1
Authors
Bai, Linhan [1 ,2 ,3 ]
Zheng, Fangfang [1 ,2 ,3 ]
Hou, Kangning [1 ,2 ,3 ]
Liu, Xiaobo [1 ,2 ,3 ]
Lu, Liang [1 ,2 ,3 ]
Liu, Can [1 ,2 ,3 ]
Affiliations
[1] Southwest Jiaotong Univ, Sch Transportat & Logist, Chengdu 611756, Peoples R China
[2] Southwest Jiaotong Univ, Natl Engn Lab Integrated Transportat Big Data Appl, Chengdu 611756, Peoples R China
[3] Southwest Jiaotong Univ, Natl United Engn Lab Integrated & Intelligent Tran, Chengdu 611756, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Adaptation models; Safety; Neural networks; Training; Numerical models; Computational modeling; Vehicles; Longitudinal control; automated vehicle; deep reinforcement learning; combined model; ADAPTIVE CRUISE CONTROL; DECISION-MAKING; CONGESTION;
DOI
10.1109/TVT.2024.3376599
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics & Communication Technology];
Discipline Codes
0808; 0809;
Abstract
Deep reinforcement learning (DRL) provides a promising approach for the implementation of autonomous driving. By utilizing a trained DRL model as the longitudinal controller, the automated vehicle (AV) can generate optimal action outputs based on the state within a shorter time compared to traditional model predictive control (MPC) methods. However, the non-interpretability of neural networks poses a potential risk for real-world vehicle operation. This paper focuses on applying the Twin Delayed Deep Deterministic Policy Gradient (TD3), a state-of-the-art (SOTA) DRL algorithm, to train the longitudinal control model for AVs. We confirm the risks associated with the TD3-based longitudinal control model by assessing its violation of the rational driving constraint (RDC), which represents the basic conditions for normal driving behaviors. To mitigate these risks, we propose a novel model that integrates the TD3-based model with the intelligent driver model (IDM) using a new indicator called velocity response time (VRT). This indicator identifies risky outputs of the TD3-based model and calculates the combined weights of both the IDM and TD3-based models. This combination allows us to reduce risks associated with the non-interpretability of the neural network while also capturing the effect of engine time lag. Numerical simulations are conducted to evaluate the performance of the proposed combined model. The results demonstrate that the proposed combined model outperforms the TD3-based model, IDM, and another SOTA approach in terms of disturbance mitigation, safety improvement, and suppression of traffic oscillation. Additionally, the combined model exhibits greater computational efficiency than MPC, making it well-suited for real-time control of AVs.
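The combined controller described in the abstract can be sketched as follows. The IDM acceleration uses the standard formulation; the parameter values and the fixed blending weight `w` are illustrative assumptions only — the paper's actual VRT-based weight computation is not reproduced in this record.

```python
import math

def idm_accel(v, dv, s, v0=30.0, T=1.5, a_max=1.0, b=2.0, s0=2.0, delta=4):
    """Intelligent Driver Model acceleration (standard formulation).

    v:  ego speed [m/s]
    dv: approach rate, v_ego - v_lead [m/s]
    s:  bumper-to-bumper gap to the leader [m]
    Parameter values here are common textbook defaults, not the paper's.
    """
    # Desired dynamic gap s* (kept non-negative)
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / s) ** 2)

def combined_accel(a_td3, v, dv, s, w):
    """Blend a TD3 policy output with the IDM response.

    w in [0, 1] is the IDM weight; in the paper it is derived from the
    velocity response time (VRT) indicator, which flags risky TD3 outputs.
    Here w is simply passed in as a given.
    """
    return w * idm_accel(v, dv, s) + (1.0 - w) * a_td3
```

With `w = 0` the controller falls back to the raw TD3 action; with `w = 1` it follows the IDM, which is the interpretable safeguard the paper relies on when the DRL output violates the rational driving constraint.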
Pages: 11014-11028
Page count: 15