Optimistic sequential multi-agent reinforcement learning with motivational communication

Cited by: 1
Authors
Huang, Anqi [1]
Wang, Yongli [1]
Zhou, Xiaoliang [1]
Zou, Haochen [1]
Dong, Xu [1]
Che, Xun [1]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-agent reinforcement learning; Policy gradient; Motivational communication; Reinforcement learning; Multi-agent system;
DOI
10.1016/j.neunet.2024.106547
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Centralized Training with Decentralized Execution (CTDE) is a prevalent paradigm in the field of fully cooperative Multi-Agent Reinforcement Learning (MARL). Existing algorithms often encounter two major problems: independent strategies tend to underestimate the potential value of actions, leading to convergence on sub-optimal Nash Equilibria (NE); some communication paradigms introduce added complexity to the learning process, complicating the focus on the essential elements of the messages. To address these challenges, we propose a novel method called Optimistic Sequential Soft Actor Critic with Motivational Communication (OSSMC). The key idea of OSSMC is to utilize a greedy-driven approach to explore the potential value of individual policies, named optimistic Q-values, which serve as an upper bound for the Q-value of the current policy. We then integrate a sequential update mechanism with optimistic Q-values for agents, aiming to ensure monotonic improvement in the joint policy optimization process. Moreover, we establish motivational communication modules for each agent to disseminate motivational messages to promote cooperative behaviors. Finally, we employ a value regularization strategy from the Soft Actor Critic (SAC) method to maximize entropy and improve exploration capabilities. The performance of OSSMC was rigorously evaluated against a series of challenging benchmark sets. Empirical results demonstrate that OSSMC not only surpasses current baseline algorithms but also exhibits a more rapid convergence rate.
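The abstract's two value-related ideas can be illustrated with a toy, single-state sketch. This is not the OSSMC algorithm, whose details are not given in this record. It only assumes a discrete action set, and the function names and numbers below are hypothetical. It shows that a greedy (max over actions) value is never below the expected Q-value under any policy, which is the sense in which an optimistic value can act as an upper bound, and it shows the standard SAC-style soft value, which adds an entropy bonus weighted by a temperature alpha.

```python
import math

def expected_q(q_values, policy):
    """Q-value of the current policy: expectation of Q under its action probabilities."""
    return sum(p * q for p, q in zip(policy, q_values))

def optimistic_q(q_values):
    """Greedy value (max over actions). Illustrative stand-in for an 'optimistic' value;
    it is never below the expectation of Q under any action distribution."""
    return max(q_values)

def soft_value(q_values, policy, alpha):
    """SAC-style soft state value: E_a[Q(s, a) - alpha * log pi(a | s)]."""
    return sum(
        p * (q - alpha * math.log(p))
        for p, q in zip(policy, q_values)
        if p > 0  # zero-probability actions contribute nothing
    )

# Hypothetical numbers for one state with three actions.
q_values = [1.0, 3.0, 2.0]
policy = [0.5, 0.25, 0.25]
alpha = 0.2  # entropy temperature (assumed value)

print("expected Q under policy:", expected_q(q_values, policy))
print("greedy (optimistic) Q:  ", optimistic_q(q_values))
print("soft value:             ", soft_value(q_values, policy, alpha))
```

In this toy setting the soft value exceeds the plain expectation by alpha times the policy entropy, so a larger alpha rewards more spread-out (exploratory) policies.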
Pages: 12
Related papers
50 in total
  • [21] A sequential multi-agent reinforcement learning framework for different action spaces
    Tian, Shucong
    Yang, Meng
    Xiong, Rongling
    He, Xingxing
    Rajasegarar, Sutharshan
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 258
  • [22] Deep Multi-Agent Reinforcement Learning With Minimal Cross-Agent Communication for SFC Partitioning
    Pentelas, Angelos
    De Vleeschauwer, Danny
    Chang, Chia-Yu
    De Schepper, Koen
    Papadimitriou, Panagiotis
    IEEE ACCESS, 2023, 11 : 40384 - 40398
  • [23] SCM network with multi-agent reinforcement learning
    Zhao, Gang
    Sun, Ruoying
    FIFTH WUHAN INTERNATIONAL CONFERENCE ON E-BUSINESS, VOLS 1-3, 2006, : 1294 - 1300
  • [24] Reinforcement Learning with Quantitative Verification for Assured Multi-Agent Policies
    Riley, Joshua
    Calinescu, Radu
    Paterson, Colin
    Kudenko, Daniel
    Banks, Alec
    ICAART: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 2, 2021, : 237 - 245
  • [25] WRFMR: A Multi-Agent Reinforcement Learning Method for Cooperative Tasks
    Liu, Hui
    Zhang, Zhen
    Wang, Dongqing
    IEEE ACCESS, 2020, 8 : 216320 - 216331
  • [26] Reinforcement learning of multi-agent communicative acts
    Hoet S.
    Sabouret N.
    Revue d'Intelligence Artificielle, 2010, 24 (02) : 159 - 188
  • [27] HyperComm: Hypergraph-based communication in multi-agent reinforcement learning
    Zhu, Tianyu
    Shi, Xinli
    Xu, Xiangping
    Gui, Jie
    Cao, Jinde
    NEURAL NETWORKS, 2024, 178
  • [28] Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning
    Chen, Hao
    Yang, Guangkai
    Zhang, Junge
    Yin, Qiyue
    Huang, Kaiqi
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [29] Multi-agent reinforcement learning: A survey
    Busoniu, Lucian
    Babuska, Robert
    De Schutter, Bart
    2006 9TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION, VOLS 1- 5, 2006, : 1133 - +
  • [30] A multi-agent reinforcement learning framework for cross-domain sequential recommendation
    Liu, Huiting
    Wei, Junyi
    Zhu, Kaiwen
    Li, Peipei
    Zhao, Peng
    Wu, Xindong
    NEURAL NETWORKS, 2025, 185