Centralized Cooperation for Connected and Automated Vehicles at Intersections by Proximal Policy Optimization

Cited by: 120
Authors
Guan, Yang [1 ,2 ]
Ren, Yangang [1 ,2 ]
Li, Shengbo Eben [1 ,2 ]
Sun, Qi [1 ,2 ]
Luo, Laiquan [1 ,2 ]
Li, Keqiang [1 ,2 ]
Affiliations
[1] Tsinghua Univ, State Key Lab Automot Safety & Energy, Sch Vehicle & Mobil, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Ctr Intelligent Connected Vehicles & Transportat, Beijing 100084, Peoples R China
Keywords
Optimization; Computational modeling; Acceleration; Linear programming; Real-time systems; Safety; Trajectory; Centralized coordination method; connected and automated vehicle; reinforcement learning; traffic intersection
DOI
10.1109/TVT.2020.3026111
CLC classification
TM [Electrical technology]; TN [Electronic and communication technology]
Discipline codes
0808; 0809
Abstract
Connected vehicles will change how future transportation is managed and organized, especially at intersections without traffic lights. Centralized coordination methods globally coordinate vehicles approaching the intersection from all sections by considering their states together. However, they require substantial computational resources, because a centralized controller must optimize the trajectories of all approaching vehicles in real time. In this paper, we propose a centralized coordination scheme for automated vehicles at an intersection without traffic signals that uses reinforcement learning (RL) to address the low computational efficiency of current centralized coordination methods. We first propose an RL training algorithm, model-accelerated proximal policy optimization (MA-PPO), which incorporates a prior model into the proximal policy optimization (PPO) algorithm to accelerate learning in terms of sample efficiency. We then present the design of the state, action, and reward that formulates centralized coordination as an RL problem. Finally, we train a coordination policy in a simulation setting and compare its computing time and traffic efficiency with those of a coordination scheme based on model predictive control (MPC). Results show that our method requires only 1/400 of the computing time of MPC and increases the efficiency of the intersection by 4.5 times.
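MA-PPO builds on the standard clipped surrogate objective of PPO (Schulman et al., 2017). As a minimal illustrative sketch of that base objective only (plain NumPy; the function name `ppo_clip_loss` is mine, and this is not the paper's MA-PPO implementation, which additionally uses a prior model to generate extra training samples):

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """Clipped surrogate loss of PPO.

    ratio:     pi_new(a|s) / pi_old(a|s) for the sampled actions
    advantage: advantage estimates for those actions
    eps:       clipping range (0.2 is the common default)
    Returns the negated objective, i.e. a quantity to minimize.
    """
    unclipped = ratio * advantage
    # Clipping the ratio removes the incentive to move the policy
    # far from the behavior policy in a single update.
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    return -np.mean(np.minimum(unclipped, clipped))

# Example: a ratio of 1.5 with positive advantage is clipped at 1.2,
# so the objective contribution is 1.2, not 1.5.
loss = ppo_clip_loss(np.array([1.5]), np.array([1.0]))
```

In MA-PPO as described in the abstract, this policy update is fed not only with environment samples but also with rollouts from a prior model of vehicle dynamics, which is what improves sample efficiency.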
Pages: 12597-12608
Page count: 12