Multi-Agent Probabilistic Ensembles With Trajectory Sampling for Connected Autonomous Vehicles

Cited by: 1
Authors
Wen, Ruoqi [1]
Huang, Jiahao [1]
Li, Rongpeng [1]
Ding, Guoru [2]
Zhao, Zhifeng [1,3]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310058, Peoples R China
[2] Army Engn Univ PLA, Coll Commun & Engn, Nanjing 210007, Peoples R China
[3] Zhejiang Lab, Hangzhou 311121, Peoples R China
Keywords
Reinforcement learning; Decision making; Data models; Uncertainty; Probabilistic logic; Vehicle dynamics; Trajectory; Autonomous vehicle control; multi-agent model-based reinforcement learning; probabilistic ensembles with trajectory sampling; VALUE DECOMPOSITION; REINFORCEMENT; COMPLEXITY;
DOI
10.1109/TVT.2024.3424191
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract
Connected Autonomous Vehicles (CAVs) have attracted significant attention in recent years, and Reinforcement Learning (RL) has shown remarkable performance in improving vehicle autonomy. In that regard, Model-Based RL (MBRL) offers sample-efficient learning, but its asymptotic performance may lag behind state-of-the-art Model-Free RL (MFRL) algorithms. Furthermore, most studies on CAVs consider the decision-making of a single vehicle only, which degrades performance owing to the absence of inter-vehicle communication. In this study, we address the decision-making problem of multiple CAVs under limited communication and propose a decentralized Multi-Agent Probabilistic Ensembles (PE) with Trajectory Sampling (TS) algorithm, namely MA-PETS. In particular, to better capture the uncertainty of the unknown environment, MA-PETS leverages PE neural networks to learn the dynamics from samples communicated among neighboring CAVs. MA-PETS then applies TS-based model-predictive control for decision-making. On this basis, we derive a multi-agent group regret bound that depends on the number of agents within communication range, and mathematically show that incorporating effective information exchange among agents into the multi-agent learning scheme reduces the worst-case group regret bound. Finally, we empirically demonstrate the superiority of MA-PETS over MFRL in terms of sample efficiency.
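The abstract above outlines the core mechanism, namely a probabilistic ensemble (PE) dynamics model combined with trajectory-sampling (TS) model-predictive control. Below is a minimal, hedged sketch of such a PETS-style loop on a toy one-dimensional task; it is not the authors' MA-PETS implementation. The linear-Gaussian ensemble members, the random-shooting planner (PETS itself typically uses CEM), the toy dynamics, and all hyperparameters are illustrative assumptions, and in MA-PETS the replay data would additionally include transitions communicated by neighboring CAVs.

```python
# Illustrative PETS-style sketch: probabilistic ensemble + trajectory-sampling MPC.
# Assumptions: linear-Gaussian ensemble members, random-shooting planning, toy 1-D dynamics.
import numpy as np

rng = np.random.default_rng(0)

def true_step(s, a):
    """Unknown environment dynamics (toy): the agent must drive s toward 0."""
    return 0.9 * s + 0.5 * a + 0.05 * rng.normal()

class ProbabilisticEnsemble:
    """Ensemble of B probabilistic dynamics models s' ~ N(w·[s, a, 1], sigma^2)."""
    def __init__(self, n_members=5):
        self.n_members = n_members
        self.params = [None] * n_members            # (weights, noise std) per member

    def fit(self, data):
        """Fit each member on its own bootstrap resample of the replay data."""
        S, A, S2 = map(np.asarray, zip(*data))
        X = np.stack([S, A, np.ones_like(S)], axis=1)
        for b in range(self.n_members):
            idx = rng.integers(0, len(data), size=len(data))    # bootstrap sample
            w, *_ = np.linalg.lstsq(X[idx], S2[idx], rcond=None)
            sigma = np.std(S2[idx] - X[idx] @ w) + 1e-3
            self.params[b] = (w, sigma)

    def sample_next(self, s, a, member):
        w, sigma = self.params[member]
        mean = w @ np.array([s, a, 1.0])
        return mean + sigma * rng.normal()          # aleatoric noise of that member

def plan(model, s0, horizon=10, n_candidates=200, n_particles=20):
    """Random-shooting MPC with trajectory sampling: each particle follows one
    randomly chosen ensemble member for its entire rollout."""
    candidates = rng.uniform(-1, 1, size=(n_candidates, horizon))
    best_a, best_ret = 0.0, -np.inf
    for actions in candidates:
        total = 0.0
        for _ in range(n_particles):
            member = rng.integers(model.n_members)
            s = s0
            for a in actions:
                s = model.sample_next(s, a, member)
                total += -s ** 2                    # reward: keep the state near zero
        if total / n_particles > best_ret:
            best_ret, best_a = total / n_particles, actions[0]
    return best_a                                   # execute only the first action

# Online loop: act, record the transition, refit the ensemble.
data, s = [], 2.0
for t in range(30):
    if len(data) < 5:
        a = rng.uniform(-1, 1)                      # warm-up with random actions
    else:
        model = ProbabilisticEnsemble()
        model.fit(data)
        a = plan(model, s)
    s_next = true_step(s, a)
    data.append((s, a, s_next))
    s = s_next
print(f"final |state| after 30 steps: {abs(s):.3f}")
```

Drawing a fresh ensemble member per particle and keeping it fixed for the whole rollout corresponds to the TS-infinity propagation variant described in the PETS literature; bootstrapped fitting keeps the members diverse so that disagreement between them reflects model (epistemic) uncertainty.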
Pages: 16076-16091
Number of pages: 16