Multi-Agent Probabilistic Ensembles With Trajectory Sampling for Connected Autonomous Vehicles

Cited by: 1
Authors
Wen, Ruoqi [1]
Huang, Jiahao [1]
Li, Rongpeng [1]
Ding, Guoru [2]
Zhao, Zhifeng [1, 3]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310058, Peoples R China
[2] Army Engn Univ PLA, Coll Commun & Engn, Nanjing 210007, Peoples R China
[3] Zhejiang Lab, Hangzhou 311121, Peoples R China
Keywords
Reinforcement learning; Decision making; Data models; Uncertainty; Probabilistic logic; Vehicle dynamics; Trajectory; Autonomous vehicle control; multi-agent model-based reinforcement learning; probabilistic ensembles with trajectory sampling; VALUE DECOMPOSITION; REINFORCEMENT; COMPLEXITY;
DOI
10.1109/TVT.2024.3424191
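Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract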
Connected Autonomous Vehicles (CAVs) have attracted significant attention in recent years, and Reinforcement Learning (RL) has shown remarkable performance in improving the autonomy of vehicles. In that regard, Model-Based RL (MBRL) manifests itself in sample-efficient learning, but the asymptotic performance of MBRL might lag behind the state-of-the-art Model-Free RL (MFRL) algorithms. Furthermore, most studies for CAVs are limited to the decision-making of a single vehicle only, thus undermining performance due to the absence of communications. In this study, we try to address the decision-making problem of multiple CAVs with limited communications and propose a decentralized Multi-Agent Probabilistic Ensembles (PEs) with Trajectory Sampling (TS) algorithm, namely MA-PETS. In particular, to better capture the uncertainty of the unknown environment, MA-PETS leverages PE neural networks to learn from communicated samples among neighboring CAVs. Afterward, MA-PETS develops TS-based model-predictive control for decision-making. On this basis, we derive the multi-agent group regret bound affected by the number of agents within the communication range and mathematically validate that incorporating effective information exchange among agents into the multi-agent learning scheme contributes to reducing the group regret bound in the worst case. Finally, we empirically demonstrate the superiority of MA-PETS in terms of the sample efficiency comparable to MFRL.
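To illustrate the planning idea named in the abstract, the following is a minimal single-agent sketch of model-predictive control with trajectory sampling over a probabilistic ensemble. It is not the authors' implementation. Everything in it is an assumption made for illustration: the 1-D linear-Gaussian models stand in for learned PE neural networks, the quadratic cost and the names `make_ensemble`, `evaluate_plan`, and `mpc_action` are hypothetical, and plain random shooting is used as the action optimizer. Model training and the exchange of samples among neighboring CAVs are not shown.

```python
import random


def make_ensemble(num_models, rng):
    """Hypothetical stand-in for a learned probabilistic ensemble.

    Each member predicts a Gaussian over the next 1-D state:
    s' ~ N(a * s + b * u, sigma^2). Members differ slightly in (a, b),
    which mimics disagreement between independently trained models.
    """
    return [
        {
            "a": 1.0 + rng.gauss(0.0, 0.02),
            "b": 0.5 + rng.gauss(0.0, 0.02),
            "sigma": 0.05,
        }
        for _ in range(num_models)
    ]


def sample_next_state(model, state, action, rng):
    """Draw one next state from a member's predictive distribution."""
    mean = model["a"] * state + model["b"] * action
    return rng.gauss(mean, model["sigma"])


def evaluate_plan(ensemble, state, plan, target, num_particles, rng):
    """Estimate the return of an action sequence by trajectory sampling.

    Each particle is rolled out through one randomly chosen ensemble
    member for the whole horizon; the reward (an assumption) is the
    negative squared distance to the target state.
    """
    total = 0.0
    for _ in range(num_particles):
        model = rng.choice(ensemble)
        s = state
        ret = 0.0
        for u in plan:
            s = sample_next_state(model, s, u, rng)
            ret -= (s - target) ** 2
        total += ret
    return total / num_particles


def mpc_action(ensemble, state, target, horizon=5, num_candidates=200,
               num_particles=10, rng=None):
    """Random-shooting MPC: score sampled plans, return the first action
    of the best one. Actions are bounded to [-1, 1] by assumption."""
    rng = rng or random.Random(0)
    best_plan, best_value = None, float("-inf")
    for _ in range(num_candidates):
        plan = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        value = evaluate_plan(ensemble, state, plan, target,
                              num_particles, rng)
        if value > best_value:
            best_plan, best_value = plan, value
    return best_plan[0]
```

In a receding-horizon loop, `mpc_action` would be called at every step with the current state, only the returned first action would be executed, and the plan would be recomputed at the next step.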
Pages: 16076-16091
Page count: 16