Heuristic-Based Multi-Agent Deep Reinforcement Learning Approach for Coordinating Connected and Automated Vehicles at Non-Signalized Intersection

Times cited: 0

Authors:

Guo, Zihan [1,2]

Wu, Yan [1,2]

Wang, Lifang [1,2]

Zhang, Junzhi [3]

Affiliations:

[1] Chinese Acad Sci, Inst Elect Engn, Key Lab High Dens Electromagnet Power & Syst, Beijing 100190, Peoples R China

[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China

[3] Tsinghua Univ, Dept Automot Engn, Key Lab Automot Safety & Energy, Beijing 100084, Peoples R China

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2024 / Vol. 25 / No. 11

Keywords:

Heuristic algorithms; Deep reinforcement learning; Autonomous vehicles; Training; Delays; Transfer learning; Q-learning; Optimization; Merging; Game theory; Non-signalized intersection management; multi-agent deep reinforcement learning; zero-shot generalization; communication latency;

DOI:

10.1109/TITS.2024.3407760

Chinese Library Classification (CLC) number:

TU [Building Science]

Discipline classification code:

0813

Abstract:

One typical application of connected and automated vehicles (CAVs) is coordinating multiple CAVs at a non-signalized intersection in mixed traffic, where multi-agent deep reinforcement learning (MDRL) approaches can improve overall coordination efficiency. This study proposes a heuristic-based MDRL algorithm (H-QMIX) built on QMIX, a value-based MDRL algorithm. The algorithm incorporates a heuristic-based action mask module, composed of a stimulative passing sequence and safety restrictions on the CAVs' action space in the junction area, to guide CAVs through intersections efficiently and safely. Compared with other MDRL algorithms (e.g., IPPO, QMIX), H-QMIX demonstrates improved training performance in terms of safety and efficiency in two case studies: the first requires all CAVs to follow fixed routes, and the second allows CAVs to choose random routes. To assess generalization, the trained models with the maximal episodic return are then transferred, in a zero-shot manner, to a more practical scenario with a certain vehicle-to-vehicle (V2V) communication delay. The simulation results indicate that H-QMIX is robust to a certain degree of communication delay. The code for this paper is available at: https://github.com/flammingRaven/heuristic_based_qmix.
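The abstract describes an action mask that restricts each CAV's action space before a value-based agent picks an action. The general idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the three-action layout (stop, keep speed, accelerate), and the right-of-way rule are all hypothetical, chosen only to show how a mask interacts with greedy Q-value selection.

```python
import math
from typing import List, Sequence


def masked_greedy_action(q_values: Sequence[float], mask: Sequence[bool]) -> int:
    """Return the index of the highest-valued action among those the mask allows."""
    if len(q_values) != len(mask):
        raise ValueError("q_values and mask must have the same length")
    if not any(mask):
        raise ValueError("mask must allow at least one action")
    # Disallowed actions are set to -inf so they can never win the argmax.
    masked = [q if ok else -math.inf for q, ok in zip(q_values, mask)]
    return max(range(len(masked)), key=masked.__getitem__)


def junction_safety_mask(n_actions: int, has_right_of_way: bool,
                         stop_action: int = 0) -> List[bool]:
    """Hypothetical heuristic: without right of way, only stop_action is allowed."""
    if has_right_of_way:
        return [True] * n_actions
    return [i == stop_action for i in range(n_actions)]


# Hypothetical Q-values for the actions [stop, keep speed, accelerate].
q = [0.1, 0.5, 0.9]
print(masked_greedy_action(q, junction_safety_mask(3, has_right_of_way=True)))   # 2
print(masked_greedy_action(q, junction_safety_mask(3, has_right_of_way=False)))  # 0
```

Because the mask is applied before the argmax, an unsafe action is never selected no matter how high its estimated value is. That is the usual motivation for masking instead of relying on reward penalties alone.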


Pages: 16235-16248

Page count: 14
