Attention-Based Intrinsic Reward Mixing Network for Credit Assignment in Multiagent Reinforcement Learning

Cited by: 5

Authors:

Li, Wei [1]
Liu, Weiyan [1]
Shao, Shitong [1]
Huang, Shiyi [1]
Song, Aiguo [1]

Affiliations:

[1] Southeast Univ, Sch Instrument Sci & Engn, Nanjing 210096, Jiangsu, Peoples R China

Source:

IEEE TRANSACTIONS ON GAMES | 2024 / Vol. 16 / No. 02

Keywords:

Training; Teamwork; Reinforcement learning; Games; Behavioral sciences; Optimization; Task analysis; Attention mechanism; credit assignment; intrinsic reward; mixing network; multiagent reinforcement learning; LEVEL; GAMES;

DOI:

10.1109/TG.2023.3263013

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Credit assignment is a critical problem in cooperative multiagent reinforcement learning (MARL). To address this problem, current studies mainly rely on the intrinsic reward, which is directly summed with the global reward to generate a total reward. However, such kinds of intrinsic reward functions ignore the dependence among agents and inevitably limit the adaptivity and effectiveness of MARL methods. In this article, we propose a novel method, Attention-based Intrinsic Reward Mixing Network (AIRMN), for credit assignment in MARL. Specifically, we design a new intrinsic reward network on the basis of the attention mechanism, in order to enhance the effectiveness of teamwork. Besides, we devise a new mixing network that combines the intrinsic and extrinsic rewards in a nonlinear and dynamic manner, so as to adapt the total reward to the variation of the environment. Experimental results on the battle games of StarCraft II demonstrate that AIRMN outperforms the state-of-the-art methods in terms of the average test win rate and also validate that AIRMN can dynamically return the precise intrinsic reward to each agent based on their contributions to the team cooperation, thereby better dealing with the credit assignment problem.
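The contrast the abstract draws, between summing intrinsic and global rewards directly and mixing them in a nonlinear, state-dependent way, can be illustrated with a toy sketch. This is not the AIRMN architecture, which the abstract does not specify. Every function name, the dot-product attention, and the sigmoid-gated `tanh` mixer below are hypothetical stand-ins chosen only to show the general idea.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_intrinsic_rewards(agent_feats):
    """Hypothetical: one scalar intrinsic reward per agent.

    Each agent attends to every agent's features (scaled dot-product
    attention), so its reward depends on the other agents as well.
    """
    d = len(agent_feats[0])
    rewards = []
    for query in agent_feats:
        scores = [dot(query, key) / math.sqrt(d) for key in agent_feats]
        weights = softmax(scores)
        # Attention-weighted average of all agents' feature vectors.
        context = [
            sum(w * feat[j] for w, feat in zip(weights, agent_feats))
            for j in range(d)
        ]
        rewards.append(math.tanh(dot(query, context) / d))
    return rewards


def summed_total_rewards(intrinsic, extrinsic):
    """Baseline described in the abstract: intrinsic + global reward."""
    return [extrinsic + r for r in intrinsic]


def mixed_total_rewards(intrinsic, extrinsic, state, gate_weights):
    """Hypothetical mixer: a state-dependent gate inside a tanh.

    The gate changes with the state (dynamic), and the tanh makes the
    combination nonlinear. The real mixing network may differ entirely.
    """
    gate = 1.0 / (1.0 + math.exp(-dot(gate_weights, state)))
    return [math.tanh(gate * extrinsic + (1.0 - gate) * r) for r in intrinsic]


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up agent features
    r_int = attention_intrinsic_rewards(feats)
    print(summed_total_rewards(r_int, extrinsic=1.0))
    print(mixed_total_rewards(r_int, 1.0, state=[1.0, 0.0], gate_weights=[2.0, 0.0]))
```

In this sketch the summed baseline weights the two reward sources identically in every state, whereas the gated version shifts weight between them as the state changes.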

Pages: 270-281

Page count: 12
