Multi-Agent Constrained Policy Optimization for Conflict-Free Management of Connected Autonomous Vehicles at Unsignalized Intersections

Cited by: 3

Authors:

Zhao, Rui ^{[1]}

Li, Yun ^{[2]}

Gao, Fei ^{[3]}

Gao, Zhenhai ^{[3]}

Zhang, Tianyao ^{[3]}

Affiliations:

[1] Jilin Univ, Coll Automot Engn, Changchun 130025, Peoples R China

[2] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo 1138654, Japan

[3] Jilin Univ, State Key Lab Automot Simulat & Control, Changchun 130025, Peoples R China

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2024 / Vol. 25 / No. 06

Funding:

National Natural Science Foundation of China;

Keywords:

Safety; Computational efficiency; Trajectory; Autonomous vehicles; Roads; Collaboration; Vehicle dynamics; Conflict-free management; connected autonomous vehicles; safety reinforcement learning; multi-agent constrained policy optimization; unsignalized intersections; AUTOMATED VEHICLES; SYSTEM;

DOI:

10.1109/TITS.2023.3331723

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813 ;

Abstract:

Autonomous Intersection Management (AIM) systems present a new paradigm for conflict-free cooperation of connected autonomous vehicles (CAVs) at road intersections, the aim of which is to eliminate collisions and improve traffic efficiency and ride comfort. Given the difficulty current centralized coordination methods have in balancing high computational efficiency with robust safety assurance, this paper proposes a conflict-free management scheme for CAVs at unsignalized intersections that leverages safe multi-agent deep reinforcement learning (MADRL). First, we formulate the safe MADRL problem as a constrained Markov game (CMG) and then transform the AIM problem into a CMG by carefully designing state, action, reward, and cost functions. Subsequently, we propose Multi-Agent Constrained Policy Optimization (MACPO), specifically tailored to solve the CMG problem. MACPO incorporates safety constraints that further restrict the trust region formed by the Kullback-Leibler (KL) divergence, facilitating reinforcement learning policy updates that maximize performance while keeping constraint costs within their limits. This leads us to introduce the MACPO-based AIM algorithm. Finally, we train an AIM policy and compare its computation time, ride comfort, traffic efficiency, and safety with management schemes based on Model Predictive Control (MPC), Mixed Integer Programming (MIP), and non-safety-aware reinforcement learning. According to the results, compared with the MPC and MIP methods, our method increases computational efficiency by 65.22 times and 731.52 times respectively, and improves traffic efficiency by 2.41 times and 1.80 times respectively. In contrast to the non-safety-aware RL methods, our method achieves a zero collision rate for the first time while also enhancing ride comfort, highlighting the advantages of using MACPO.
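
The abstract does not state the optimization problem itself. As a rough sketch only, assuming MACPO follows the standard constrained policy optimization template applied to one agent at a time, a single policy update for agent i could be written as below. Every symbol here is an illustrative assumption rather than the paper's notation: \pi^{i} is agent i's policy, \pi_k the current joint policy, d^{\pi_k} the normalized discounted state distribution, A^{i} and A^{i}_{C_j} the reward and cost advantage functions, J^{i}_{C_j} the expected discounted cost, c^{i}_{j} the cost limit, \gamma the discount factor, and \delta the trust-region radius.

```latex
\[
\begin{aligned}
\pi^{i}_{k+1} = \arg\max_{\pi^{i}} \;\;
  & \mathbb{E}_{s \sim d^{\pi_k},\, a^{i} \sim \pi^{i}}
    \left[ A^{i}_{\pi_k}(s, a^{i}) \right] \\
\text{s.t.} \;\;
  & J^{i}_{C_j}(\pi_k)
    + \frac{1}{1-\gamma}\,
      \mathbb{E}_{s \sim d^{\pi_k},\, a^{i} \sim \pi^{i}}
      \left[ A^{i}_{C_j,\pi_k}(s, a^{i}) \right]
    \le c^{i}_{j} \quad \forall j, \\
  & \bar{D}_{\mathrm{KL}}\!\left( \pi^{i} \,\|\, \pi^{i}_{k} \right) \le \delta .
\end{aligned}
\]
```

Read this way, the cost constraints cut away the part of the KL trust region in which the expected cost would exceed its limit, which is consistent with the abstract's description of safety constraints that further restrict the trust region.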

Citation

Pages: 5374 - 5388

Page count: 15

16 items in total

[1] Centralized cooperative control for autonomous vehicles at unsignalized all-directional intersections: A multi-agent projection-based constrained policy optimization approach
Zhao, Rui
Wang, Kui
Li, Yun
Fan, Yuze
Gao, Fei
Gao, Zhenhai
EXPERT SYSTEMS WITH APPLICATIONS, 2025, 267
[2] Conflict-free optimal control of connected automated vehicles at unsignalized intersections: A condition-based computational framework with constrained terminal position and speed
Xue, Yongjie
Zhang, Li
Sun, Yuxuan
Zhou, Yu
Liu, Zhiyuan
Yu, Bin
TRANSPORTATION RESEARCH PART B-METHODOLOGICAL, 2025, 195
[3] Multi-Agent Deep Reinforcement Learning to Manage Connected Autonomous Vehicles at Tomorrow's Intersections
Guillen-Perez, Antonio
Cano, Maria-Dolores
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2022, 71 (07) : 7033 - 7043
[4] Coordination for Connected and Autonomous Vehicles at Unsignalized Intersections: An Iterative Learning-Based Collision-Free Motion Planning Method
Wang, Bowen
Gong, Xinle
Wang, Yafei
Lyu, Peiyuan
Liang, Sheng
IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (03) : 5439 - 5454
[5] Multi-Agent Probabilistic Ensembles With Trajectory Sampling for Connected Autonomous Vehicles
Wen, Ruoqi
Huang, Jiahao
Li, Rongpeng
Ding, Guoru
Zhao, Zhifeng
IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2024, 73 (11) : 16076 - 16091
[6] A Survey on the Use of the Multi-agent Paradigm in Coordination of Connected and Autonomous Vehicles
Cabri, Giacomo
Leonardi, Letizia
Rotonda, Enzo
INTELLIGENT DISTRIBUTED COMPUTING XV, IDC 2022, 2023, 1089 : 118 - 124
[7] Multi-Agent Intersection Management for Connected Vehicles using an Optimal Scheduling Approach
Jin, Qiu
Wu, Guoyuan
Boriboonsomsin, Kanok
Barth, Matthew
2012 INTERNATIONAL CONFERENCE ON CONNECTED VEHICLES AND EXPO (ICCVE), 2012, : 185 - 190
[8] Toward edge-computing-enabled collision-free scheduling management autonomous vehicles at unsignalized intersections
Lu, Ziyi
Wu, Tianxiong
Su, Jinshan
Xu, Yunting
Qian, Bo
Zhang, Tianqi
Zhou, Haibo
DIGITAL COMMUNICATIONS AND NETWORKS, 2024, 10 (06) : 1600 - 1610
[9] Multi-Agent Reinforcement Learning for Traffic Flow Management of Autonomous Vehicles
Mushtaq, Anum
Ul Haq, Irfan
Sarwar, Muhammad Azeem
Khan, Asifullah
Khalil, Wajeeha
Mughal, Muhammad Abid
SENSORS, 2023, 23 (05)
[10] A Multi-Agent Reinforcement Learning Approach for Safe and Efficient Behavior Planning of Connected Autonomous Vehicles
Han, Songyang
Zhou, Shanglin
Wang, Jiangwei
Pepin, Lynn
Ding, Caiwen
Fu, Jie
Miao, Fei
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (05) : 3654 - 3670
