Cooperative Highway Work Zone Merge Control Based on Reinforcement Learning in a Connected and Automated Environment

Cited by: 28
Authors
Ren, Tianzhu [1 ]
Xie, Yuanchang [2 ]
Jiang, Liming [2 ]
Affiliations
[1] Amazon, Seattle, WA USA
[2] Univ Massachusetts, Dept Civil & Environm Engn, Lowell, MA 01854 USA
Funding
U.S. National Science Foundation;
Keywords
Vehicles;
DOI
10.1177/0361198120935873
Chinese Library Classification
TU [Building Science];
Discipline Code
0813;
Abstract
Given the aging infrastructure and the anticipated growing number of highway work zones in the U.S.A., it is important to investigate work zone merge control, which is critical for improving work zone safety and capacity. This paper proposes and evaluates a novel highway work zone merge control strategy based on cooperative driving behavior enabled by artificial intelligence. The proposed method assumes that all vehicles are fully automated, connected, and cooperative. It inserts two metering zones in the open lane to make space for merging vehicles in the closed lane. In addition, each vehicle in the closed lane learns how to adjust its longitudinal position optimally to find a safe gap in the open lane using an off-policy soft actor-critic reinforcement learning (RL) algorithm, considering its surrounding traffic conditions. The learning results are captured in convolutional neural networks and used to control individual vehicles in the testing phase. By adding the metering zones and taking the locations, speeds, and accelerations of surrounding vehicles into account, cooperation among vehicles is implicitly considered. This RL-based model is trained and evaluated using a microscopic traffic simulator. The results show that this cooperative RL-based merge control significantly outperforms popular strategies such as late merge and early merge in terms of both mobility and safety measures. It also performs better than a strategy assuming all vehicles are equipped with cooperative adaptive cruise control.
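The abstract describes the method only at a high level (a soft actor-critic policy over convolutional encodings of surrounding vehicles' locations, speeds, and accelerations). The sketch below is an illustrative reconstruction, not the authors' implementation: the grid discretization, network sizes, module names (e.g. MergeActor), and the 3 m/s^2 acceleration bound are all assumptions introduced here for demonstration.

```python
# Hypothetical sketch (not the paper's code): a SAC-style Gaussian actor whose
# input is a small grid of surrounding-vehicle features (position, speed,
# acceleration channels) and whose output is a bounded longitudinal acceleration.
import torch
import torch.nn as nn

class MergeActor(nn.Module):
    """CNN state encoder with a Gaussian policy head, as in soft actor-critic."""
    def __init__(self, channels=3, grid=(2, 20), hidden=128, act_limit=3.0):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        flat = 32 * grid[0] * grid[1]
        self.mu = nn.Sequential(nn.Linear(flat, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.log_std = nn.Sequential(nn.Linear(flat, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.act_limit = act_limit  # assumed max longitudinal acceleration (m/s^2)

    def forward(self, state):
        # state: (batch, 3, lanes, cells) -- occupancy-style grid holding the
        # positions, speeds, and accelerations of vehicles around the ego vehicle.
        z = self.encoder(state)
        mu, log_std = self.mu(z), self.log_std(z).clamp(-5, 2)
        dist = torch.distributions.Normal(mu, log_std.exp())
        raw = dist.rsample()                       # reparameterized sample (off-policy SAC)
        action = torch.tanh(raw) * self.act_limit  # squash to a bounded acceleration command
        return action

# Example: one state with 2 lanes discretized into 20 cells around the ego vehicle.
actor = MergeActor()
state = torch.zeros(1, 3, 2, 20)
print(actor(state).shape)  # torch.Size([1, 1])
```

A full soft actor-critic agent would add twin soft Q-critics, an entropy temperature, and a replay buffer for off-policy updates; only the CNN-encoded actor is sketched here.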
Pages: 363-374
Page count: 12