Temporal Inconsistency-Based Intrinsic Reward for Multi-Agent Reinforcement Learning

Cited by: 0
Authors
Sun, Shaoqi [1 ]
Xu, Kele [1 ]
Affiliations
[1] Natl Univ Def Technol, Natl Key Lab Parallel & Distributed Proc, Changsha, Peoples R China
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023
DOI
10.1109/IJCNN54540.2023.10191420
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-agent reinforcement learning (MARL) has shown promising results in many challenging sequential decision-making tasks. Recently, deep neural networks have dominated this field. However, the agents' policy networks may fall into local optima during the training phase, which severely constrains exploration performance. To address this issue, we propose a novel MARL framework named PSAM, which contains a new temporal inconsistency-based intrinsic reward and a diversity control strategy. Specifically, we save the parameters of the deep models along the optimization path of the agent's policy network; these saved parameters are denoted as snapshots. By measuring the difference between snapshots, we can employ that difference as an intrinsic reward. Moreover, we propose a diversity control strategy to improve performance further. Finally, to verify the effectiveness of the proposed method, we conduct extensive experiments in several widely used MARL environments. The results show that in many environments, PSAM can not only achieve state-of-the-art performance and prevent the policy network from getting stuck in local minima but also accelerate the agent's learning of the policy. It is worth noting that the proposed regularizer can be used in a plug-and-play manner without introducing any additional hyper-parameters or training costs.
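The abstract does not say how the difference between snapshots is measured, so the following is only a minimal sketch of the general idea, not the PSAM implementation. It assumes a toy linear softmax policy and uses the L1 distance between the action distributions that two saved parameter snapshots produce for the same observation. The function names (`policy_probs`, `snapshot_inconsistency`) and the choice of distance are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence

Weights = List[List[float]]  # one row of weights per action (toy linear policy)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights: Weights, obs: Sequence[float]) -> List[float]:
    """Action distribution of a linear policy: logit[a] = dot(weights[a], obs)."""
    logits = [sum(w * o for w, o in zip(row, obs)) for row in weights]
    return softmax(logits)


def snapshot_inconsistency(old: Weights, new: Weights, obs: Sequence[float]) -> float:
    """Assumed measure: L1 distance between the action distributions of two
    parameter snapshots on the same observation (0 when they agree, at most 2)."""
    p_old = policy_probs(old, obs)
    p_new = policy_probs(new, obs)
    return sum(abs(a - b) for a, b in zip(p_old, p_new))


# Two hypothetical snapshots saved at different points of training.
snapshot_early: Weights = [[1.0, 0.0], [0.0, 1.0]]
snapshot_late: Weights = [[0.0, 1.0], [1.0, 0.0]]
observation = [1.0, 0.0]

intrinsic = snapshot_inconsistency(snapshot_early, snapshot_late, observation)
print(f"intrinsic reward (assumed L1 measure): {intrinsic:.4f}")
```

In this sketch, observations on which the two snapshots disagree receive a larger bonus, which is one plausible reading of "temporal inconsistency". How the bonus is combined with the environment reward, and how the diversity control strategy works, is not stated in the abstract and is left out here.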
Pages: 7
Related papers
50 in total
  • [31] Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward
    Qu, Guannan
    Lin, Yiheng
    Wierman, Adam
    Li, Na
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [32] Reinforcement learning based on multi-agent in RoboCup
    Zhang, W
    Li, JG
    Ruan, XG
    ADVANCES IN INTELLIGENT COMPUTING, PT 1, PROCEEDINGS, 2005, 3644 : 967 - 975
  • [33] Reinforcement Learning for Multi-Agent Systems with Temporal Logic Specifications
    Terashima, Keita
    Kobayashi, Koichi
    Yamashita, Yuh
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2024, E107A (01) : 31 - 37
  • [34] Multi-Agent Reinforcement Learning
    Stankovic, Milos
    2016 13TH SYMPOSIUM ON NEURAL NETWORKS AND APPLICATIONS (NEUREL), 2016 : 43 - 43
  • [35] Social Influence as Intrinsic Motivation for Multi-Agent Deep Reinforcement Learning
    Jaques, Natasha
    Lazaridou, Angeliki
    Hughes, Edward
    Gulcehre, Caglar
    Ortega, Pedro A.
    Strouse, D. J.
    Leibo, Joel Z.
    de Freitas, Nando
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [36] Mobile User Interface Adaptation Based on Usability Reward Model and Multi-Agent Reinforcement Learning
    Vidmanov, Dmitry
    Alfimtsev, Alexander
    MULTIMODAL TECHNOLOGIES AND INTERACTION, 2024, 8 (04)
  • [37] Intrinsic Action Tendency Consistency for Cooperative Multi-Agent Reinforcement Learning
    Zhang, Junkai
    Zhang, Yifan
    Zhang, Xi Sheryl
    Zang, Yifan
    Cheng, Jian
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024 : 17600 - 17608
  • [38] Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs
    Duc Thien Nguyen
    Yeoh, William
    Lau, Hoong Chuin
    Zilberstein, Shlomo
    Zhang, Chongjie
    PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2014 : 1447 - 1455
  • [39] Multi-Agent Deep Reinforcement Learning With Progressive Negative Reward for Cryptocurrency Trading
    Kumlungmak, Kittiwin
    Vateekul, Peerapon
    IEEE ACCESS, 2023, 11 : 66440 - 66455
  • [40] Leaders and Collaborators: Addressing Sparse Reward Challenges in Multi-Agent Reinforcement Learning
    Sun, Shaoqi
    Liu, Hui
    Xu, Kele
    Ding, Bo
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024