Multi-Agent Probabilistic Ensembles With Trajectory Sampling for Connected Autonomous Vehicles

Cited by: 1
Authors
Wen, Ruoqi [1]
Huang, Jiahao [1]
Li, Rongpeng [1]
Ding, Guoru [2]
Zhao, Zhifeng [1,3]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310058, Peoples R China
[2] Army Engn Univ PLA, Coll Commun & Engn, Nanjing 210007, Peoples R China
[3] Zhejiang Lab, Hangzhou 311121, Peoples R China
Keywords
Reinforcement learning; Decision making; Data models; Uncertainty; Probabilistic logic; Vehicle dynamics; Trajectory; Autonomous vehicle control; multi-agent model-based reinforcement learning; probabilistic ensembles with trajectory sampling; VALUE DECOMPOSITION; REINFORCEMENT; COMPLEXITY;
DOI
10.1109/TVT.2024.3424191
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract
Connected Autonomous Vehicles (CAVs) have attracted significant attention in recent years, and Reinforcement Learning (RL) has shown remarkable performance in improving vehicle autonomy. In that regard, Model-Based RL (MBRL) offers sample-efficient learning, but its asymptotic performance may lag behind state-of-the-art Model-Free RL (MFRL) algorithms. Furthermore, most studies on CAVs consider the decision-making of a single vehicle only, which degrades performance owing to the absence of inter-vehicle communication. In this study, we address the decision-making problem of multiple CAVs under limited communication and propose a decentralized Multi-Agent Probabilistic Ensembles (PE) with Trajectory Sampling (TS) algorithm, namely MA-PETS. In particular, to better capture the uncertainty of the unknown environment, MA-PETS leverages PE neural networks to learn the dynamics from samples communicated among neighboring CAVs. MA-PETS then applies TS-based model-predictive control for decision-making. On this basis, we derive a multi-agent group regret bound that depends on the number of agents within communication range, and mathematically show that incorporating effective information exchange among agents into the multi-agent learning scheme reduces the worst-case group regret bound. Finally, we empirically demonstrate the superiority of MA-PETS over MFRL in terms of sample efficiency.
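The abstract above outlines the core mechanism, namely a probabilistic ensemble (PE) dynamics model combined with trajectory-sampling (TS) model-predictive control. Below is a minimal, hedged sketch of such a PETS-style loop on a toy one-dimensional task; it is not the authors' MA-PETS implementation. The linear-Gaussian ensemble members, the random-shooting planner (PETS itself typically uses CEM), the toy dynamics, and all hyperparameters are illustrative assumptions, and in MA-PETS the replay data would additionally include transitions communicated by neighboring CAVs.

```python
# Illustrative PETS-style sketch: probabilistic ensemble + trajectory-sampling MPC.
# Assumptions: linear-Gaussian ensemble members, random-shooting planning, toy 1-D dynamics.
import numpy as np

rng = np.random.default_rng(0)

def true_step(s, a):
    """Unknown environment dynamics (toy): the agent must drive s toward 0."""
    return 0.9 * s + 0.5 * a + 0.05 * rng.normal()

class ProbabilisticEnsemble:
    """Ensemble of B probabilistic dynamics models s' ~ N(w·[s, a, 1], sigma^2)."""
    def __init__(self, n_members=5):
        self.n_members = n_members
        self.params = [None] * n_members            # (weights, noise std) per member

    def fit(self, data):
        """Fit each member on its own bootstrap resample of the replay data."""
        S, A, S2 = map(np.asarray, zip(*data))
        X = np.stack([S, A, np.ones_like(S)], axis=1)
        for b in range(self.n_members):
            idx = rng.integers(0, len(data), size=len(data))    # bootstrap sample
            w, *_ = np.linalg.lstsq(X[idx], S2[idx], rcond=None)
            sigma = np.std(S2[idx] - X[idx] @ w) + 1e-3
            self.params[b] = (w, sigma)

    def sample_next(self, s, a, member):
        w, sigma = self.params[member]
        mean = w @ np.array([s, a, 1.0])
        return mean + sigma * rng.normal()          # aleatoric noise of that member

def plan(model, s0, horizon=10, n_candidates=200, n_particles=20):
    """Random-shooting MPC with trajectory sampling: each particle follows one
    randomly chosen ensemble member for its entire rollout."""
    candidates = rng.uniform(-1, 1, size=(n_candidates, horizon))
    best_a, best_ret = 0.0, -np.inf
    for actions in candidates:
        total = 0.0
        for _ in range(n_particles):
            member = rng.integers(model.n_members)
            s = s0
            for a in actions:
                s = model.sample_next(s, a, member)
                total += -s ** 2                    # reward: keep the state near zero
        if total / n_particles > best_ret:
            best_ret, best_a = total / n_particles, actions[0]
    return best_a                                   # execute only the first action

# Online loop: act, record the transition, refit the ensemble.
data, s = [], 2.0
for t in range(30):
    if len(data) < 5:
        a = rng.uniform(-1, 1)                      # warm-up with random actions
    else:
        model = ProbabilisticEnsemble()
        model.fit(data)
        a = plan(model, s)
    s_next = true_step(s, a)
    data.append((s, a, s_next))
    s = s_next
print(f"final |state| after 30 steps: {abs(s):.3f}")
```

Drawing a fresh ensemble member per particle and keeping it fixed for the whole rollout corresponds to the TS-infinity propagation variant described in the PETS literature; bootstrapped fitting keeps the members diverse so that disagreement between them reflects model (epistemic) uncertainty.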
Pages: 16076-16091
Number of pages: 16