Hybrid Car-Following Strategy Based on Deep Deterministic Policy Gradient and Cooperative Adaptive Cruise Control

Cited by: 38

Authors:

Yan, Ruidong [1]

Jiang, Rui [1]

Jia, Bin [1]

Huang, Jin [2]

Yang, Diange [2]

Affiliations:

[1] Beijing Jiaotong Univ, Sch Traff & Transportat, Beijing 100044, Peoples R China

[2] Tsinghua Univ, Sch Vehicle & Mobil, Beijing 100084, Peoples R China

Source:

IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING | 2022 / Vol. 19 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

Mathematical model; Differential equations; Cruise control; Training; Reinforcement learning; Adaptation models; Space exploration; Car-following; cooperative adaptive cruise control (CACC); deep deterministic policy gradient (DDPG); hybrid strategy;

DOI:

10.1109/TASE.2021.3100709

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

A deep deterministic policy gradient (DDPG)-based car-following strategy can break through the constraints of differential equation models because of its ability to explore complex environments. However, the car-following performance of DDPG is often degraded by unreasonable reward function design, insufficient training, and low sampling efficiency. To address these problems, a hybrid car-following strategy based on DDPG and cooperative adaptive cruise control (CACC) is proposed. First, the car-following process is modeled as a Markov decision process so that CACC and DDPG can both be calculated at each frame. Given the current state, one action is obtained from CACC and one from DDPG. Then, the action that offers the larger reward is chosen as the output of the hybrid strategy. Meanwhile, a rule is designed to ensure that the rate of change of acceleration stays below a desired value. The proposed strategy therefore guarantees the basic car-following performance through CACC while making full use of DDPG's ability to explore complex environments. Finally, simulation results show that the proposed strategy improves car-following performance compared with either DDPG or CACC alone.
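The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the CACC law, the reward, the one-step prediction model, the constants (`DT`, `MAX_JERK`, `TIME_GAP`, `STANDSTILL`), and every function name below are assumptions chosen for illustration. The DDPG actor is represented by any callable that maps a state to an acceleration, and applying the acceleration-rate limit to both candidates before comparing them is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable

# All constants are hypothetical illustration values, not taken from the paper.
DT = 0.1          # frame length (s)
MAX_JERK = 2.0    # allowed rate of change of acceleration (m/s^3)
TIME_GAP = 1.0    # desired time gap (s)
STANDSTILL = 2.0  # desired spacing at standstill (m)


@dataclass
class State:
    gap: float         # spacing to the leader (m)
    speed: float       # follower speed (m/s)
    lead_speed: float  # leader speed (m/s)
    lead_accel: float  # leader acceleration (m/s^2)
    accel: float       # follower acceleration applied in the previous frame


def gap_error(s: State) -> float:
    """Actual spacing minus a constant-time-gap desired spacing."""
    return s.gap - (STANDSTILL + TIME_GAP * s.speed)


def cacc_action(s: State) -> float:
    """Generic linear feedback/feedforward CACC law (assumed gains)."""
    return 0.2 * gap_error(s) + 0.7 * (s.lead_speed - s.speed) + 0.5 * s.lead_accel


def predict(s: State, a: float) -> State:
    """One-frame kinematic prediction, assuming the leader holds its acceleration."""
    speed = max(0.0, s.speed + a * DT)
    lead_speed = max(0.0, s.lead_speed + s.lead_accel * DT)
    gap = s.gap + 0.5 * ((s.lead_speed + lead_speed) - (s.speed + speed)) * DT
    return State(gap, speed, lead_speed, s.lead_accel, a)


def reward(s: State, a: float) -> float:
    """Assumed reward: penalize spacing error and speed difference after one frame."""
    nxt = predict(s, a)
    return -gap_error(nxt) ** 2 - 0.1 * (nxt.lead_speed - nxt.speed) ** 2


def limit_jerk(prev_accel: float, a: float) -> float:
    """Clamp the change in acceleration to MAX_JERK * DT per frame."""
    max_delta = MAX_JERK * DT
    return min(max(a, prev_accel - max_delta), prev_accel + max_delta)


def hybrid_action(s: State, ddpg_policy: Callable[[State], float]) -> float:
    """Return whichever candidate (CACC or DDPG) yields the larger reward."""
    candidates = [
        limit_jerk(s.accel, cacc_action(s)),
        limit_jerk(s.accel, ddpg_policy(s)),
    ]
    return max(candidates, key=lambda a: reward(s, a))
```

In use, `hybrid_action(state, actor)` would be called once per frame with the trained DDPG actor passed as `actor`. Because the CACC candidate is always in the comparison, the chosen action never scores lower than the CACC action under the assumed reward.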


Pages: 2816-2824

Page count: 9

