Celebrating Diversity With Subtask Specialization in Shared Multiagent Reinforcement Learning

Cited by: 1
Authors
Li, Chenghao [1]
Wang, Tonghan [2]
Wu, Chengjie [1]
Zhao, Qianchuan [1]
Yang, Jun [1]
Zhang, Chongjie [3]
Affiliations
[1] Tsinghua Univ, Beijing 100190, Peoples R China
[2] Harvard Univ, Boston, MA 02115 USA
[3] Washington Univ St Louis, St Louis, MO 63130 USA
Funding
National Natural Science Foundation of China
Keywords
Diversity; information-theoretic learning; interpretability; multiagent reinforcement learning (MARL); subtask specialization
DOI
10.1109/TNNLS.2023.3326744
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Subtask decomposition offers a promising approach for achieving and comprehending complex cooperative behaviors in multiagent systems. Nonetheless, existing methods often depend on intricate high-level strategies, which can hinder interpretability and learning efficiency. To tackle these challenges, we propose a novel approach that specializes subtasks for subgroups by employing diverse observation representation encoders within information bottlenecks. Moreover, to enhance the efficiency of subtask specialization while promoting sophisticated cooperation, we introduce diversity in both optimization and neural network architectures. These advancements enable our method to achieve state-of-the-art performance and offer interpretable subtask factorization across various scenarios in Google Research Football (GRF).
Pages: 2051-2065
Page count: 15