Deep Multi-Agent Reinforcement Learning for Highway On-Ramp Merging in Mixed Traffic

Cited by: 81

Authors:

Chen, Dong [1]

Hajidavalloo, Mohammad R. [1]

Li, Zhaojian [1]

Chen, Kaian [1]

Wang, Yongqiang [2]

Jiang, Longsheng [3]

Wang, Yue [3]

Affiliations:

[1] Michigan State Univ, Dept Mech Engn, Lansing, MI 48824 USA

[2] Clemson Univ, Dept Elect & Comp Engn, Clemson, SC 29630 USA

[3] Clemson Univ, Dept Mech Engn, Clemson, SC 29634 USA

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2023 / Vol. 24 / No. 11

Funding:

U.S. National Science Foundation;

Keywords:

Multi-agent deep reinforcement learning; connected autonomous vehicles; safety enhancement; on-ramp merging; MODEL;

DOI:

10.1109/TITS.2023.3285442

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813;

Abstract:

On-ramp merging is a challenging task for autonomous vehicles (AVs), especially in mixed traffic where AVs coexist with human-driven vehicles (HDVs). In this paper, we formulate the mixed-traffic highway on-ramp merging problem as a multi-agent reinforcement learning (MARL) problem, where the AVs (on both merge lane and through lane) collaboratively learn a policy to adapt to HDVs to maximize the traffic throughput. We develop an efficient and scalable MARL framework that can be used in dynamic traffic where the communication topology could be time-varying. Parameter sharing and local rewards are exploited to foster inter-agent cooperation while achieving great scalability. An action masking scheme is employed to improve learning efficiency by filtering out invalid/unsafe actions at each step. In addition, a novel priority-based safety supervisor is developed to significantly reduce collision rate and greatly expedite the training process. A gym-like simulation environment is developed and open-sourced with three different levels of traffic densities. We exploit curriculum learning to efficiently learn harder tasks from trained models under simpler settings. Comprehensive experimental results show the proposed MARL framework consistently outperforms several state-of-the-art benchmarks.
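The action masking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `masked_softmax`, the five-action space, and the example logits and validity flags are all hypothetical, chosen only to show how invalid or unsafe actions can be given zero probability before an action is sampled.

```python
import math

def masked_softmax(logits, valid):
    """Turn policy logits into probabilities, forcing invalid actions to zero.

    `logits` and `valid` are equal-length lists; `valid[i]` is True when
    action i is allowed in the current state (hypothetical interface).
    """
    if not any(valid):
        raise ValueError("at least one action must be valid")
    # Invalid actions get a logit of -inf, so exp() maps them to exactly 0.
    masked = [x if ok else -math.inf for x, ok in zip(logits, valid)]
    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical discrete action space for a single vehicle.
ACTIONS = ["lane_left", "idle", "lane_right", "faster", "slower"]

# Assumed example: there is no lane to the left, so "lane_left" is masked out.
logits = [0.2, 1.0, 0.5, 0.1, -0.3]
valid = [False, True, True, True, True]

probs = masked_softmax(logits, valid)
for name, p in zip(ACTIONS, probs):
    print(f"{name}: {p:.3f}")
```

Because the masked action has probability exactly zero, it can never be sampled, so the agent does not spend training steps exploring it.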


Pages: 11623-11638

Page count: 16

