Hybrid Car-Following Strategy Based on Deep Deterministic Policy Gradient and Cooperative Adaptive Cruise Control

Cited by: 38
Authors
Yan, Ruidong [1]
Jiang, Rui [1]
Jia, Bin [1]
Huang, Jin [2]
Yang, Diange [2]
Affiliations
[1] Beijing Jiaotong Univ, Sch Traff & Transportat, Beijing 100044, Peoples R China
[2] Tsinghua Univ, Sch Vehicle & Mobil, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Mathematical model; Differential equations; Cruise control; Training; Reinforcement learning; Adaptation models; Space exploration; Car-following; cooperative adaptive cruise control (CACC); deep deterministic policy gradient (DDPG); hybrid strategy;
DOI
10.1109/TASE.2021.3100709
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
A car-following strategy based on the deep deterministic policy gradient (DDPG) can overcome the constraints of differential equation models because it is able to explore complex environments. However, the car-following performance of DDPG is often degraded by unreasonable reward function design, insufficient training, and low sampling efficiency. To address these problems, a hybrid car-following strategy based on DDPG and cooperative adaptive cruise control (CACC) is proposed. First, the car-following process is modeled as a Markov decision process so that CACC and DDPG are both computed at each frame. Given the current state, two actions are obtained, one from CACC and one from DDPG. The action that offers the larger reward is then chosen as the output of the hybrid strategy. In addition, a rule is designed to ensure that the rate of change of acceleration stays below the desired value. The proposed strategy therefore guarantees basic car-following performance through CACC while also exploiting DDPG's ability to explore complex environments. Finally, simulation results show that the proposed strategy achieves better car-following performance than either DDPG or CACC alone.
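The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the State fields, the function names, the clamping form of the jerk rule, the choice to clamp before comparing rewards, and the default values of max_jerk and dt are all assumptions made for illustration. The abstract does not specify the reward function, the CACC law, or the exact rule.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class State:
    """Hypothetical car-following state (field names are assumptions)."""
    gap: float         # distance to the leading vehicle, m
    ego_speed: float   # follower speed, m/s
    lead_speed: float  # leader speed, m/s
    prev_accel: float  # acceleration applied at the previous frame, m/s^2


Policy = Callable[[State], float]          # maps a state to an acceleration
Reward = Callable[[State, float], float]   # scores an acceleration in a state


def limit_jerk(accel: float, prev_accel: float, max_jerk: float, dt: float) -> float:
    """Clamp accel so that |accel - prev_accel| / dt does not exceed max_jerk.

    Assumed form of the rule; the abstract only says the change rate of
    acceleration is kept below a desired value.
    """
    max_delta = max_jerk * dt
    return min(max(accel, prev_accel - max_delta), prev_accel + max_delta)


def hybrid_action(
    state: State,
    cacc_policy: Policy,
    ddpg_policy: Policy,
    reward_fn: Reward,
    max_jerk: float = 2.0,  # m/s^3, placeholder value
    dt: float = 0.1,        # frame length in seconds, placeholder value
) -> float:
    """Return whichever candidate action (CACC or DDPG) earns the larger reward."""
    candidates = [cacc_policy(state), ddpg_policy(state)]
    # Assumption: the jerk limit is applied to both candidates before comparison.
    candidates = [limit_jerk(a, state.prev_accel, max_jerk, dt) for a in candidates]
    # On a tie, max() keeps the first candidate, i.e. the CACC action.
    return max(candidates, key=lambda a: reward_fn(state, a))
```

In practice ddpg_policy would wrap a trained actor network and cacc_policy a conventional controller; here any two callables that map a State to an acceleration can be passed in, together with whatever reward function was used for training.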
Pages: 2816-2824
Number of pages: 9