Cooperative Multi-Agent Reinforcement Learning with Hierarchical Relation Graph under Partial Observability

Cited by: 8
Authors
Li, Yang [1 ]
Wang, Xinzhi [1 ]
Wang, Jianshu [1 ]
Wang, Wei [1 ]
Luo, Xiangfeng [1 ]
Xie, Shaorong [1 ]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai, Peoples R China
Source
2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI) | 2020
Funding
National Natural Science Foundation of China
Keywords
Reinforcement Learning; Multi-Agent; Hierarchical Relation Graph;
DOI
10.1109/ICTAI50040.2020.00011
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Cooperation among agents with partial observation is an important task in multi-agent reinforcement learning (MARL), aiming to maximize a common reward. Most existing cooperative MARL approaches focus on building different model frameworks, such as centralized, decentralized, and centralized training with decentralized execution. These methods directly employ agents' partial observations as input, but rarely consider the local relationships between agents. Local relationships help agents integrate observations from nearby agents and thereby adopt a more effective cooperation policy. In this paper, we propose a MARL method based on spatial relationships, called hierarchical relation graph soft actor-critic (HRG-SAC). The method first uses a hierarchical relation graph generation module to represent the spatial relationships between agents in local space. Second, it integrates feature information from the relation graph through a graph convolution network (GCN). Finally, soft actor-critic (SAC) is used to optimize agents' actions during training for compliance control. We conduct experiments on the Food Collector task and compare HRG-SAC with three baseline methods. The results demonstrate that the hierarchical relation graph can significantly improve MARL performance in the cooperative task.
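The GCN step in the abstract, where each agent's features are aggregated over its neighbors in the local relation graph, can be illustrated with a minimal sketch. This is not the paper's implementation: the adjacency matrix, dimensions, and function names are illustrative assumptions, and a standard symmetrically normalized graph-convolution layer is used.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: each agent's features become a
    degree-normalized average over its neighbors (plus itself),
    projected through a learned weight matrix, with ReLU activation."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    deg = a_hat.sum(axis=1)                     # node degrees
    d_inv_sqrt = np.diag(deg ** -0.5)           # D^{-1/2}
    norm_adj = d_inv_sqrt @ a_hat @ d_inv_sqrt  # symmetric normalization
    return np.maximum(norm_adj @ feats @ weight, 0.0)

# 4 agents in a chain-shaped local relation graph,
# 3-dim partial observations mapped to a 2-dim embedding
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
obs = np.random.randn(4, 3)   # per-agent observation features
w = np.random.randn(3, 2)     # layer weights
emb = gcn_layer(adj, obs, w)
print(emb.shape)  # (4, 2)
```

The resulting per-agent embeddings, now mixing information from neighboring agents, would feed the actor and critic networks of SAC.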
Pages: 1-8 (8 pages)
Related Papers
50 items total
[41]   GLOBAL-LOCALIZED AGENT GRAPH CONVOLUTION FOR MULTI-AGENT REINFORCEMENT LEARNING [J].
Liu, Yuntao ;
Dou, Yong ;
Shen, Siqi ;
Qiao, Peng .
2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, :3480-3484
[42]   Learning Cooperative Multi-Agent Policies With Partial Reward Decoupling [J].
Freed, Benjamin ;
Kapoor, Aditya ;
Abraham, Ian ;
Schneider, Jeff ;
Choset, Howie .
IEEE ROBOTICS AND AUTOMATION LETTERS, 2022, 7 (02) :890-897
[43]   Enhanced Cooperative Multi-agent Learning Algorithms (ECMLA) using Reinforcement Learning [J].
Vidhate, Deepak A. ;
Kulkarni, Parag .
2016 INTERNATIONAL CONFERENCE ON COMPUTING, ANALYTICS AND SECURITY TRENDS (CAST), 2016, :556-561
[44]   Intelligent Spectrum Sensing and Access With Partial Observation Based on Hierarchical Multi-Agent Deep Reinforcement Learning [J].
Li, Xuanheng ;
Zhang, Yulong ;
Ding, Haichuan ;
Fang, Yuguang .
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2024, 23 (04) :3131-3145
[45]   Cooperative reinforcement learning in topology-based multi-agent systems [J].
Dan Xiao ;
Ah-Hwee Tan .
AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2013, 26 :86-119
[46]   Cooperative Multi-Agent Reinforcement Learning with Constraint-Reduced DCOP [J].
Yi Xie ;
Zhongyi Liu ;
Zhao Liu ;
Yijun Gu .
JOURNAL OF BEIJING INSTITUTE OF TECHNOLOGY, 2017, 26 (04) :525-533
[47]   Distributed cooperative reinforcement learning for multi-agent system with collision avoidance [J].
Lan, Xuejing ;
Yan, Jiapei ;
He, Shude ;
Zhao, Zhijia ;
Zou, Tao .
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, 34 (01) :567-585
[48]   A Cooperative Multi-Agent Reinforcement Learning Method Based on Coordination Degree [J].
Cui, Haoyan ;
Zhang, Zhen .
IEEE ACCESS, 2021, 9 :123805-123814
[49]   Testing Reinforcement Learning Explainability Methods in a Multi-Agent Cooperative Environment [J].
Domenech i Vila, Marc ;
Gnatyshak, Dmitry ;
Tormos, Adrian ;
Alvarez-Napagao, Sergio .
ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT, 2022, 356 :355-364
[50]   Autonomous and cooperative control of UAV cluster with multi-agent reinforcement learning [J].
Xu, D. ;
Chen, G. .
AERONAUTICAL JOURNAL, 2022, 126 (1300) :932-951