Cooperative Highway Work Zone Merge Control Based on Reinforcement Learning in a Connected and Automated Environment

Cited by: 28
Authors
Ren, Tianzhu [1 ]
Xie, Yuanchang [2 ]
Jiang, Liming [2 ]
Affiliations
[1] Amazon, Seattle, WA USA
[2] Univ Massachusetts, Dept Civil & Environm Engn, Lowell, MA 01854 USA
Funding
U.S. National Science Foundation;
Keywords
Vehicles;
DOI
10.1177/0361198120935873
Chinese Library Classification
TU [Building Science];
Discipline Code
0813;
Abstract
Given the aging infrastructure and the anticipated growing number of highway work zones in the U.S.A., it is important to investigate work zone merge control, which is critical for improving work zone safety and capacity. This paper proposes and evaluates a novel highway work zone merge control strategy based on cooperative driving behavior enabled by artificial intelligence. The proposed method assumes that all vehicles are fully automated, connected, and cooperative. It inserts two metering zones in the open lane to make space for merging vehicles in the closed lane. In addition, each vehicle in the closed lane learns how to adjust its longitudinal position optimally to find a safe gap in the open lane using an off-policy soft actor-critic reinforcement learning (RL) algorithm, considering its surrounding traffic conditions. The learning results are captured in convolutional neural networks and used to control individual vehicles in the testing phase. By adding the metering zones and taking the locations, speeds, and accelerations of surrounding vehicles into account, cooperation among vehicles is implicitly considered. This RL-based model is trained and evaluated using a microscopic traffic simulator. The results show that this cooperative RL-based merge control significantly outperforms popular strategies such as late merge and early merge in terms of both mobility and safety measures. It also performs better than a strategy assuming all vehicles are equipped with cooperative adaptive cruise control.
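The abstract describes the method only at a high level (a soft actor-critic policy over convolutional encodings of surrounding vehicles' locations, speeds, and accelerations). The sketch below is an illustrative reconstruction, not the authors' implementation: the grid discretization, network sizes, module names (e.g. MergeActor), and the 3 m/s^2 acceleration bound are all assumptions introduced here for demonstration.

```python
# Hypothetical sketch (not the paper's code): a SAC-style Gaussian actor whose
# input is a small grid of surrounding-vehicle features (position, speed,
# acceleration channels) and whose output is a bounded longitudinal acceleration.
import torch
import torch.nn as nn

class MergeActor(nn.Module):
    """CNN state encoder with a Gaussian policy head, as in soft actor-critic."""
    def __init__(self, channels=3, grid=(2, 20), hidden=128, act_limit=3.0):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        flat = 32 * grid[0] * grid[1]
        self.mu = nn.Sequential(nn.Linear(flat, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.log_std = nn.Sequential(nn.Linear(flat, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.act_limit = act_limit  # assumed max longitudinal acceleration (m/s^2)

    def forward(self, state):
        # state: (batch, 3, lanes, cells) -- occupancy-style grid holding the
        # positions, speeds, and accelerations of vehicles around the ego vehicle.
        z = self.encoder(state)
        mu, log_std = self.mu(z), self.log_std(z).clamp(-5, 2)
        dist = torch.distributions.Normal(mu, log_std.exp())
        raw = dist.rsample()                       # reparameterized sample (off-policy SAC)
        action = torch.tanh(raw) * self.act_limit  # squash to a bounded acceleration command
        return action

# Example: one state with 2 lanes discretized into 20 cells around the ego vehicle.
actor = MergeActor()
state = torch.zeros(1, 3, 2, 20)
print(actor(state).shape)  # torch.Size([1, 1])
```

A full soft actor-critic agent would add twin soft Q-critics, an entropy temperature, and a replay buffer for off-policy updates; only the CNN-encoded actor is sketched here.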
Pages: 363-374
Page count: 12