Consistent epistemic planning for multiagent deep reinforcement learning

Cited by: 1
Authors
Wu, Peiliang [1, 2]
Luo, Shicheng [1, 2]
Tian, Liqiang [1, 2]
Mao, Bingyi [1, 2]
Chen, Wenbai [3]
Affiliations
[1] Yanshan Univ, Sch Informat Sci & Engn, Qinhuangdao 066004, Hebei, Peoples R China
[2] Key Lab Comp Virtual Technol & Syst Integrat Hebei, Qinhuangdao 066004, Hebei, Peoples R China
[3] Beijing Informat Sci & Technol Univ, Sch Automat, Beijing 100192, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Multiagent deep reinforcement learning; Multiagent epistemic planning; Shared mental model; SMM-MEPP
DOI
10.1007/s13042-023-01989-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multiagent cooperation in a partially observable environment without communication is difficult because of the agents' uncertainty. Traditional multiagent deep reinforcement learning (MADRL) algorithms fail to address this uncertainty. To resolve this issue, we propose a MADRL-based policy network architecture called shared mental model-multiagent epistemic planning policy (SMM-MEPP). First, the architecture combines multiagent epistemic planning with MADRL to create a "perception-planning-action" multiagent epistemic planning framework, which helps multiple agents handle uncertainty better in the absence of coordination. Second, mental models are introduced and represented as neural networks, and a parameter-sharing mechanism turns them into shared mental models, which keep multiagent planning consistent without communication and make cooperation more efficient. Finally, we apply the SMM-MEPP architecture to three advanced MADRL algorithms (MAAC, MADDPG, and MAPPO) and conduct comparative experiments on multiagent cooperation tasks. The results show that the proposed method provides consistent planning for multiple agents and improves convergence speed or training performance in a partially observable environment without communication.
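The parameter-sharing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' SMM-MEPP implementation: the class names `MentalModel` and `Agent`, the single-layer softmax network, and all dimensions are hypothetical choices made only for illustration. The sketch assumes that "shared mental model" means every agent holds a reference to the same network parameters, so that any two agents given the same observation of a teammate arrive at the same prediction without exchanging messages.

```python
import math
import random


class MentalModel:
    """Hypothetical one-layer network: observation -> action probabilities."""

    def __init__(self, obs_dim, n_actions, seed=0):
        rng = random.Random(seed)
        # weights[a][i]: contribution of observation feature i to action a
        self.weights = [
            [rng.uniform(-1.0, 1.0) for _ in range(obs_dim)]
            for _ in range(n_actions)
        ]

    def predict(self, obs):
        """Return a softmax distribution over actions for one observation."""
        logits = [sum(w * x for w, x in zip(row, obs)) for row in self.weights]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


class Agent:
    """Hypothetical agent that reasons about teammates via a mental model."""

    def __init__(self, name, mental_model):
        self.name = name
        # Parameter sharing (assumed): store a reference, not a copy.
        self.mental_model = mental_model

    def predict_teammate(self, teammate_obs):
        return self.mental_model.predict(teammate_obs)


# One set of parameters, referenced by every agent.
shared = MentalModel(obs_dim=3, n_actions=2)
agents = [Agent(f"agent_{i}", shared) for i in range(3)]

obs = [0.5, -0.2, 1.0]
predictions = [agent.predict_teammate(obs) for agent in agents]

# With shared parameters, all agents agree on the prediction.
consistent = all(p == predictions[0] for p in predictions)

# For contrast: independently initialized models have no such guarantee.
independent = [MentalModel(3, 2, seed=s).predict(obs) for s in (1, 2)]
```

Here `consistent` holds by construction, because the three agents evaluate the same parameters on the same input. The independently seeded models in `independent` carry no such guarantee, which is the kind of inconsistency that sharing is meant to avoid.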
Pages: 1663-1675
Page count: 13
Related papers
50 in total
  • [1] Consistent epistemic planning for multiagent deep reinforcement learning
    Peiliang Wu
    Shicheng Luo
    Liqiang Tian
    Bingyi Mao
    Wenbai Chen
    International Journal of Machine Learning and Cybernetics, 2024, 15 : 1663 - 1675
  • [2] Path Planning of Multiagent Constrained Formation through Deep Reinforcement Learning
    Sui, Zezhi
    Pu, Zhiqiang
    Yi, Jianqiang
    Tan, Xiangmin
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018
  • [3] A survey and critique of multiagent deep reinforcement learning
    Pablo Hernandez-Leal
    Bilal Kartal
    Matthew E. Taylor
    Autonomous Agents and Multi-Agent Systems, 2019, 33 : 750 - 797
  • [4] Deep multiagent reinforcement learning: challenges and directions
    Annie Wong
    Thomas Bäck
    Anna V. Kononova
    Aske Plaat
    Artificial Intelligence Review, 2023, 56 : 5023 - 5056
  • [5] Deep multiagent reinforcement learning: challenges and directions
    Wong, Annie
    Back, Thomas
    Kononova, Anna, V
    Plaat, Aske
    ARTIFICIAL INTELLIGENCE REVIEW, 2023, 56 (06) : 5023 - 5056
  • [6] A survey and critique of multiagent deep reinforcement learning
    Hernandez-Leal, Pablo
    Kartal, Bilal
    Taylor, Matthew E.
    AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2019, 33 (06) : 750 - 797
  • [7] Multiagent cooperation and competition with deep reinforcement learning
    Tampuu, Ardi
    Matiisen, Tambet
    Kodelja, Dorian
    Kuzovkin, Ilya
    Korjus, Kristjan
    Aru, Juhan
    Aru, Jaan
    Vicente, Raul
    PLOS ONE, 2017, 12 (04)
  • [8] Deep Multitask Multiagent Reinforcement Learning With Knowledge Transfer
    Mai, Yuxiang
    Zang, Yifan
    Yin, Qiyue
    Ni, Wancheng
    Huang, Kaiqi
    IEEE TRANSACTIONS ON GAMES, 2024, 16 (03) : 566 - 576
  • [9] A Distributional Perspective on Multiagent Cooperation With Deep Reinforcement Learning
    Huang, Liwei
    Fu, Mingsheng
    Rao, Ananya
    Irissappane, Athirai A.
    Zhang, Jie
    Xu, Chengzhong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (03) : 4246 - 4259
  • [10] Subgoal identification for reinforcement learning and planning in multiagent problem solving
    Chiu, Chung-Cheng
    Soo, Von-Wun
    MULTIAGENT SYSTEM TECHNOLOGIES, PROCEEDINGS, 2007, 4687 : 37 - +