Attention-Based Intrinsic Reward Mixing Network for Credit Assignment in Multiagent Reinforcement Learning

Cited by: 5

Authors:

Li, Wei [1]

Liu, Weiyan [1]

Shao, Shitong [1]

Huang, Shiyi [1]

Song, Aiguo [1]

Affiliation:

[1] Southeast Univ, Sch Instrument Sci & Engn, Nanjing 210096, Jiangsu, Peoples R China

Source:

IEEE TRANSACTIONS ON GAMES | 2024 / Vol. 16 / No. 02

Keywords:

Training; Teamwork; Reinforcement learning; Games; Behavioral sciences; Optimization; Task analysis; Attention mechanism; credit assignment; intrinsic reward; mixing network; multiagent reinforcement learning; LEVEL; GAMES;

DOI:

10.1109/TG.2023.3263013

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Credit assignment is a critical problem in cooperative multiagent reinforcement learning (MARL). To address this problem, current studies mainly rely on the intrinsic reward, which is directly summed with the global reward to generate a total reward. However, such kinds of intrinsic reward functions ignore the dependence among agents and inevitably limit the adaptivity and effectiveness of MARL methods. In this article, we propose a novel method, Attention-based Intrinsic Reward Mixing Network (AIRMN), for credit assignment in MARL. Specifically, we design a new intrinsic reward network on the basis of the attention mechanism, in order to enhance the effectiveness of teamwork. Besides, we devise a new mixing network that combines the intrinsic and extrinsic rewards in a nonlinear and dynamic manner, so as to adapt the total reward to the variation of the environment. Experimental results on the battle games of StarCraft II demonstrate that AIRMN outperforms the state-of-the-art methods in terms of the average test win rate and also validate that AIRMN can dynamically return the precise intrinsic reward to each agent based on their contributions to the team cooperation, thereby better dealing with the credit assignment problem.
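
The contrast drawn in the abstract, between summing the intrinsic and global rewards directly and mixing them in a nonlinear, state-dependent way, can be illustrated with a minimal sketch. This is not the AIRMN architecture, whose details the abstract does not give. The function names, the scaled dot-product attention over per-agent feature vectors, and the sigmoid gate with a `tanh` squashing are all assumptions chosen only to make the idea concrete.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_intrinsic_rewards(queries, keys, values):
    # Hypothetical intrinsic reward: agent i attends over every agent j with
    # scaled dot-product attention, so its reward depends on the whole team.
    d = len(queries[0])
    rewards = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        rewards.append(dot(weights, values))
    return rewards

def additive_total(r_ext, r_int):
    # Baseline described in the abstract: intrinsic and global rewards are summed.
    return r_ext + r_int

def gated_total(r_ext, r_int, state, gate_weights):
    # Assumed stand-in for a nonlinear, dynamic mix: a sigmoid gate computed
    # from the environment state trades off the two reward terms.
    gate = 1.0 / (1.0 + math.exp(-dot(gate_weights, state)))
    return gate * r_ext + (1.0 - gate) * math.tanh(r_int)

# Two agents with toy feature vectors and per-agent scalar values (made up).
features = [[1.0, 0.0], [0.0, 1.0]]
r_int = attention_intrinsic_rewards(features, features, [0.2, 0.8])
totals = [gated_total(1.0, r, state=[0.5], gate_weights=[1.0]) for r in r_int]
```

In the additive baseline the weight on the intrinsic term is fixed, whereas in the gated variant the same pair of rewards yields a different total as `state` changes, which is the kind of adaptivity the abstract attributes to the mixing network.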

Pages: 270-281

Page count: 12
