Learning Distinct Strategies for Heterogeneous Cooperative Multi-agent Reinforcement Learning

Cited by: 0

Authors:

Wan, Kejia [1]

Xu, Xinhai [2]

Li, Yuan [2]

Affiliations:

[1] Def Innovat Inst, Beijing, Peoples R China

[2] Acad Mil Sci, Beijing, Peoples R China

Source:

ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2021, PT IV | 2021 / Vol. 12894

Funding:

National Natural Science Foundation of China;

Keywords:

Multi-agent reinforcement learning; Heterogeneity; Transfer learning;

DOI:

10.1007/978-3-030-86380-7_44

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Value decomposition has been a promising paradigm for cooperative multi-agent reinforcement learning. Many approaches have been proposed, but few of them consider heterogeneous settings. Agents with markedly different behaviours pose great challenges for centralized training with decentralized execution. In this paper, we provide a formulation of heterogeneous multi-agent reinforcement learning together with some theoretical analysis. On top of that, we propose an efficient two-stage heterogeneous learning method. The first stage is a transfer technique that tunes existing homogeneous models into heterogeneous ones, which accelerates convergence. In the second stage, an iterative learning procedure with centralized training is designed to improve the overall performance. We conduct experiments on heterogeneous unit micromanagement tasks in StarCraft II. The results show that our method improves the win rate by around 20% in the most difficult scenario compared with state-of-the-art methods, i.e., QMIX and Weighted QMIX.
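For readers unfamiliar with the value decomposition idea the abstract builds on, the following is a minimal sketch, not the paper's two-stage method. It assumes a simple monotonic mixing with fixed non-negative weights (in QMIX the weights are produced by a network conditioned on the global state), and all names and numbers here (`mix_monotonic`, `q_tables`, `weights`) are hypothetical. It illustrates why decentralized execution is possible: when the joint value is monotonic in each agent's value, each agent's local greedy action agrees with the jointly greedy action.

```python
from itertools import product


def mix_monotonic(agent_qs, weights, bias=0.0):
    """Combine per-agent Q-values with non-negative weights (monotonic mixing)."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep the mixing monotonic")
    return sum(w * q for w, q in zip(weights, agent_qs)) + bias


def decentralized_greedy(q_tables):
    """Each agent picks the best action from its own Q-values only."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in q_tables)


def centralized_greedy(q_tables, weights, bias=0.0):
    """Exhaustively search all joint actions for the best mixed value."""
    joint_actions = product(*(range(len(q)) for q in q_tables))
    return max(
        joint_actions,
        key=lambda joint: mix_monotonic(
            [q[a] for q, a in zip(q_tables, joint)], weights, bias
        ),
    )


# Hypothetical per-agent Q-values for a single observation. The different
# action-set sizes stand in for heterogeneous unit types.
q_tables = [
    [0.2, 1.5, -0.3],       # agent 0: 3 actions
    [0.9, 0.1],             # agent 1: 2 actions
    [-1.0, 0.4, 0.3, 2.0],  # agent 2: 4 actions
]
weights = [0.5, 2.0, 1.0]   # illustrative non-negative mixing weights

local_choice = decentralized_greedy(q_tables)
joint_choice = centralized_greedy(q_tables, weights)
print(local_choice, joint_choice)
```

The exhaustive search in `centralized_greedy` is only there to check the agreement on a toy example; it scales exponentially with the number of agents, which is the cost that decentralized greedy selection avoids.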

Citation

Pages: 544-555

Number of pages: 12

References (22 in total)

[11] Pan, Sinno Jialin; Yang, Qiang. A Survey on Transfer Learning [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2010, 22 (10): 1345-1359

[12] Rashid T., 2020, Advances in Neural Information Processing Systems, V33

[13] Rashid T, 2018, PR MACH LEARN RES, V80

[14] Samvelyan M, 2019, AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, P2186

[15] Son K, 2019, PR MACH LEARN RES, V97

[16] Sunehag P, 2018, PROCEEDINGS OF THE 17TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS (AAMAS' 18), P2085

[17] Sutton RS, 2018, ADAPT COMPUT MACH LE, P1

[18] Tirinzoni A., 2020, P INT C MACH LEARN, P9481

[19] Wang J., 2020, QPLEX: Duplex Dueling Multi

[20] Wang TH., 2020, P 37 INT C MACH LEAR, P9876
