Centralized Cooperation for Connected and Automated Vehicles at Intersections by Proximal Policy Optimization

Cited by: 99
Authors
Guan, Yang [1, 2]
Ren, Yangang [1, 2]
Li, Shengbo Eben [1, 2]
Sun, Qi [1, 2]
Luo, Laiquan [1, 2]
Li, Keqiang [1, 2]
Affiliations
[1] Tsinghua Univ, State Key Lab Automot Safety & Energy, Sch Vehicle & Mobil, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Ctr Intelligent Connected Vehicles & Transportat, Beijing 100084, Peoples R China
Keywords
Optimization; Computational modeling; Acceleration; Linear programming; Real-time systems; Safety; Trajectory; Centralized coordination method; connected and automated vehicle; reinforcement learning; traffic intersection
DOI
10.1109/TVT.2020.3026111
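Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809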
Abstract
Connected vehicles will change how future transportation is managed and organized, especially at intersections without traffic lights. Centralized coordination methods globally coordinate vehicles approaching the intersection from all sections by considering their states jointly. However, they require substantial computational resources because a centralized controller must optimize the trajectories of all approaching vehicles in real time. In this paper, we propose a centralized coordination scheme for automated vehicles at an intersection without traffic signals that uses reinforcement learning (RL) to address the low computational efficiency of current centralized coordination methods. We first propose an RL training algorithm, model accelerated proximal policy optimization (MA-PPO), which incorporates a prior model into the proximal policy optimization (PPO) algorithm to accelerate learning in terms of sample efficiency. Then we present the design of the state, action, and reward that formulates centralized coordination as an RL problem. Finally, we train a coordination policy in a simulation setting and compare its computing time and traffic efficiency with those of a coordination scheme based on model predictive control (MPC). Results show that our method spends only 1/400 of the computing time of MPC and increases the efficiency of the intersection by 4.5 times.
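For orientation, the clipped surrogate objective of standard PPO, the algorithm the abstract builds on, can be sketched as below. This is a minimal illustration of plain PPO only: it does not include the prior-model component of MA-PPO, whose details this record does not give. The function name `ppo_clip_objective` and the default `clip_eps=0.2` are assumptions chosen for illustration, not values taken from the paper.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    new_logp / old_logp: log-probabilities of the sampled actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    All names and the clip_eps default are illustrative assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic (lower) bound of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # The ratio of 2 is clipped to 1.2 for a positive advantage.
    print(ppo_clip_objective([math.log(2.0)], [0.0], [1.0]))
```

Clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is what makes repeated gradient steps on the same batch reasonably safe.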
Pages: 12597-12608
Page count: 12