GCEN: Multiagent Deep Reinforcement Learning With Grouped Cognitive Feature Representation

Cited: 2
Authors
Gao, Hao [1]
Xu, Xin [1]
Yan, Chao [1]
Lan, Yixing [1]
Yao, Kangxing [1]
Affiliations
[1] National University of Defense Technology, College of Intelligence Science and Technology, Changsha 410073, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Attention mechanism; cognitive feature extraction; cooperative multiagent reinforcement learning (RL); deep learning;
DOI
10.1109/TCDS.2023.3323987
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
In recent years, cooperative multiagent deep reinforcement learning (MADRL) has received increasing research interest and has been widely applied in domains such as computer games and coordinated multirobot systems. However, achieving high solution quality and learning efficiency remains challenging for MADRL under incomplete and noisy observations. To this end, this article proposes an MADRL approach with grouped cognitive feature representation (GCEN), following the paradigm of centralized training and decentralized execution (CTDE). Unlike previous works, GCEN incorporates a new cognitive feature representation that combines a grouped attention mechanism with a training approach based on mutual information (MI). The grouped attention mechanism selectively extracts entity features within each agent's observation field while suppressing the influence of irrelevant observations. The MI regularization term guides the agents to learn grouped cognitive features aligned with global information, mitigating the influence of partial observations. The proposed GCEN approach can be adopted as a feature representation module in different MADRL methods. Extensive experiments on the challenging level-based foraging and StarCraft II micromanagement benchmarks illustrate the effectiveness and advantages of the proposed approach. Compared with seven representative MADRL algorithms, GCEN achieves state-of-the-art performance in winning rate and training efficiency. Experimental results further demonstrate that GCEN exhibits improved generalization across varying sight ranges.
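The abstract's two ingredients can be made concrete with a short sketch. The following Python (PyTorch) snippet is a minimal illustration under stated assumptions, not the authors' implementation: the module names, tensor shapes, and the InfoNCE-style stand-in for the paper's MI regularization term are all assumptions. It shows a grouped attention layer that pools per-entity observation features into a fixed number of group slots while masking entities outside an agent's sight range, plus a regularizer that ties the pooled features to a global-information embedding available only during centralized training.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedAttention(nn.Module):
    """Pools per-entity features into n_groups slots via attention (illustrative)."""
    def __init__(self, entity_dim, feat_dim, n_groups):
        super().__init__()
        # One learned query per group; keys/values come from observed entities.
        self.queries = nn.Parameter(torch.randn(n_groups, feat_dim))
        self.key = nn.Linear(entity_dim, feat_dim)
        self.value = nn.Linear(entity_dim, feat_dim)

    def forward(self, entities, mask):
        # entities: (B, N, entity_dim); mask: (B, N) with 1 = entity is visible.
        k, v = self.key(entities), self.value(entities)            # (B, N, F)
        scores = torch.einsum('gf,bnf->bgn', self.queries, k)
        scores = scores / k.shape[-1] ** 0.5
        # Entities outside the sight range get -inf so attention ignores them
        # (assumes at least one visible entity per sample).
        scores = scores.masked_fill(mask.unsqueeze(1) == 0, float('-inf'))
        attn = F.softmax(scores, dim=-1)                           # (B, G, N)
        return torch.einsum('bgn,bnf->bgf', attn, v)               # (B, G, F)

def mi_regularizer(grouped_feats, global_emb, temperature=0.1):
    # InfoNCE-style lower bound on the MI between an agent's pooled grouped
    # features and a global embedding; a stand-in for the paper's MI term.
    z = F.normalize(grouped_feats.mean(dim=1), dim=-1)  # (B, F)
    g = F.normalize(global_emb, dim=-1)                 # (B, F)
    logits = z @ g.t() / temperature                    # (B, B) similarities
    labels = torch.arange(z.shape[0], device=z.device)  # positives on diagonal
    return F.cross_entropy(logits, labels)

Under CTDE, only the attention module would run at execution time; the regularizer requires the global embedding and therefore applies only during centralized training.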
Pages: 458-473
Page count: 16