Attention-Based Intrinsic Reward Mixing Network for Credit Assignment in Multiagent Reinforcement Learning

Cited by: 5
Authors
Li, Wei [1]
Liu, Weiyan [1]
Shao, Shitong [1]
Huang, Shiyi [1]
Song, Aiguo [1]
Affiliations
[1] Southeast Univ, Sch Instrument Sci & Engn, Nanjing 210096, Jiangsu, Peoples R China
Keywords
Training; Teamwork; Reinforcement learning; Games; Behavioral sciences; Optimization; Task analysis; Attention mechanism; credit assignment; intrinsic reward; mixing network; multiagent reinforcement learning; LEVEL; GAMES;
DOI
10.1109/TG.2023.3263013
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Credit assignment is a critical problem in cooperative multiagent reinforcement learning (MARL). To address this problem, current studies mainly rely on the intrinsic reward, which is directly summed with the global reward to generate a total reward. However, such kinds of intrinsic reward functions ignore the dependence among agents and inevitably limit the adaptivity and effectiveness of MARL methods. In this article, we propose a novel method, Attention-based Intrinsic Reward Mixing Network (AIRMN), for credit assignment in MARL. Specifically, we design a new intrinsic reward network on the basis of the attention mechanism, in order to enhance the effectiveness of teamwork. Besides, we devise a new mixing network that combines the intrinsic and extrinsic rewards in a nonlinear and dynamic manner, so as to adapt the total reward to the variation of the environment. Experimental results on the battle games of StarCraft II demonstrate that AIRMN outperforms the state-of-the-art methods in terms of the average test win rate and also validate that AIRMN can dynamically return the precise intrinsic reward to each agent based on their contributions to the team cooperation, thereby better dealing with the credit assignment problem.
Pages: 270-281
Page count: 12
Related papers
50 in total
  • [31] GreenLight: Green Traffic Signal Control using Attention-based Reinforcement Learning on Fog Computing Network
    Tang, Chengyu
    Baskiyar, Sanjeev
    2024 IEEE 15TH INTERNATIONAL GREEN AND SUSTAINABLE COMPUTING CONFERENCE, IGSC 2024, 2024, : 129 - 134
  • [32] A Cooperation Graph Approach for Multiagent Sparse Reward Reinforcement Learning
    Fu, Qingxu
    Qiu, Tenghai
    Pu, Zhiqiang
    Yi, Jianqiang
    Yuan, Wanmai
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [33] Attention-Based Distributional Reinforcement Learning for Safe and Efficient Autonomous Driving
    Liu, Jia
    Yin, Jianwen
    Jiang, Zhengmin
    Liang, Qingyi
    Li, Huiyun
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (09): : 7477 - 7484
  • [34] Attention-based Deep Reinforcement Learning for Multi-view Environments
    Barati, Elaheh
    Chen, Xuewen
    Zhong, Zichun
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 1805 - 1807
  • [35] ATTENTION-BASED CURIOSITY-DRIVEN EXPLORATION IN DEEP REINFORCEMENT LEARNING
    Reizinger, Patrik
    Szemenyei, Marton
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3542 - 3546
  • [36] Boosting Policy Learning in Reinforcement Learning via Adaptive Intrinsic Reward Regulation
    Zhao, Qian
    Han, Jinhui
    Xu, Mao
    IEEE ACCESS, 2024, 12 : 2224 - 2235
  • [37] A Decentralized Communication Framework Based on Dual-Level Recurrence for Multiagent Reinforcement Learning
    Li, Xuesi
    Li, Jingchen
    Shi, Haobin
    Hwang, Kao-Shing
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2024, 16 (02) : 640 - 649
  • [38] Attention-Based Fault-Tolerant Approach for Multi-Agent Reinforcement Learning Systems
    Gu, Shanzhi
    Geng, Mingyang
    Lan, Long
    ENTROPY, 2021, 23 (09)
  • [39] Credit assignment in movement-dependent reinforcement learning
    McDougle, Samuel D.
    Boggess, Matthew J.
    Crossley, Matthew J.
    Parvin, Darius
    Ivry, Richard B.
    Taylor, Jordan A.
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2016, 113 (24) : 6797 - 6802
  • [40] Malware Classification Using Attention-Based Transductive Learning Network
    Deng, Liting
    Wen, Hui
    Xin, Mingfeng
    Sun, Yue
    Sun, Limin
    Zhu, Hongsong
    SECURITY AND PRIVACY IN COMMUNICATION NETWORKS (SECURECOMM 2020), PT II, 2020, 336 : 403 - 418