MaDE: MULTI-SCALE DECISION ENHANCEMENT FOR MULTI-AGENT REINFORCEMENT LEARNING

Citations: 1
Authors
Ruan, Jingqing [1, 2]
Xie, Runpeng [1, 3]
Xiong, Xuantang [1, 3]
Xu, Shuang [1]
Xu, Bo [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Future Technol, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024 | 2024
Funding
National Key Research and Development Program of China;
Keywords
Multi-agent Systems; Reinforcement Learning; Decision-making; Bisimulation;
DOI
10.1109/ICASSP48485.2024.10447913
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In multi-agent reinforcement learning (MARL), limited information availability, complex agent interactions, and individual agent capabilities often bottleneck effective decision-making. Previous studies frequently fall short because they give insufficient consideration to these multi-dimensional challenges. This paper therefore introduces Multi-scale Decision Enhancement (MaDE), a methodology anchored by a dual-wise bisimulation framework for pre-training agent encoders. MaDE aims to facilitate decision-making across three dimensions: macroscale awareness, mesoscale coordination, and microscale insight. At the macro level, a pretrained global encoder captures a situational awareness map to guide overall strategies. At the meso level, specialized local encoders generate cluster-based representations to promote inter-agent cooperation. At the micro level, individual agents focus on making accurate decisions. Empirical evaluations show that MaDE outperforms state-of-the-art methods in various multi-agent environments, indicating its potential to tackle the intricate challenges of MARL and to enable agents to make more informed, coordinated, and adaptive decisions. Code is available at https://github.com/paper2023/MaDE.
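The abstract does not spell out the dual-wise bisimulation objective, so the following is only a generic sketch of what a bisimulation-style encoder loss can look like, not the MaDE loss itself. It assumes, hypothetically, that each sample comes with a latent code, a scalar reward, and a diagonal-Gaussian prediction of the next latent state. The function name `bisimulation_loss` and all parameters are illustrative and do not come from the paper or its repository.

```python
import math
from itertools import combinations


def bisimulation_loss(latents, rewards, next_means, next_stds, gamma=0.99):
    """Generic bisimulation-style loss over all sample pairs (illustrative only).

    For each pair (i, j), the L1 distance between latent codes is regressed
    onto |r_i - r_j| + gamma * W2(P_i, P_j), where W2 is the closed-form
    2-Wasserstein distance between diagonal Gaussians (an assumed dynamics model).
    """
    pairs = list(combinations(range(len(latents)), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        # Distance between the two encoded states.
        latent_dist = sum(abs(a - b) for a, b in zip(latents[i], latents[j]))
        # Immediate reward difference.
        reward_dist = abs(rewards[i] - rewards[j])
        # 2-Wasserstein distance between the predicted next-state Gaussians.
        w2 = math.sqrt(
            sum((a - b) ** 2 for a, b in zip(next_means[i], next_means[j]))
            + sum((a - b) ** 2 for a, b in zip(next_stds[i], next_stds[j]))
        )
        target = reward_dist + gamma * w2
        total += (latent_dist - target) ** 2
    return total / len(pairs)
```

In this sketch the loss is zero exactly when latent distances match the reward-plus-transition target for every pair, which is the property a bisimulation-based pre-training objective typically encourages. How MaDE applies such an objective at the global and local encoder levels is not stated in this record.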
Pages: 31-35
Number of pages: 5