MaDE: MULTI-SCALE DECISION ENHANCEMENT FOR MULTI-AGENT REINFORCEMENT LEARNING

Cited by: 1

Authors:

Ruan, Jingqing ^{[1,2]}

Xie, Runpeng ^{[1,3]}

Xiong, Xuantang ^{[1,3]}

Xu, Shuang ^{[1]}

Xu, Bo ^{[1,2,3]}

Affiliations:

[1] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Sch Future Technol, Beijing, Peoples R China

[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China

Source:

2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024 | 2024

Funding:

National Key Research and Development Program of China;

Keywords:

Multi-agent Systems; Reinforcement Learning; Decision-making; Bisimulation;

DOI:

10.1109/ICASSP48485.2024.10447913

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification codes:

070206; 082403;

Abstract:

In the domain of multi-agent reinforcement learning (MARL), the limited information availability, complex agent interactions, and individual capabilities among agents often pose a bottleneck for effective decision-making. Previous studies frequently fall short due to insufficient consideration of these multi-dimensional challenges. Thus, this paper introduces a novel methodology, termed Multi-scale Decision Enhancement (MaDE), anchored by a dual-wise bisimulation framework for pre-training agent encoders. The MaDE framework aims to facilitate decision-making across three pivotal dimensions: macroscale awareness, mesoscale coordination, and microscale insight. At the macro level, a pretrained global encoder captures a situational awareness map to guide overall strategies. At the meso level, specialized local encoders generate cluster-based representations to promote inter-agent cooperation. At the micro level, individual agents focus on the accurate decision-making process. Empirical evaluations validate that MaDE outperforms state-of-the-art methods in various multi-agent environments, which shows the potential to tackle the intricate challenges of MARL, enabling agents to make more informed, coordinated, and adaptive decisions. Code is available at https://github.com/paper2023/MaDE.

References

Pages: 31-35

Page count: 5

36 entries in total

[1] Amato C, 2013, IEEE DECIS CONTR P, P2398, DOI 10.1109/CDC.2013.6760239

[2] Cao, Yongcan; Yu, Wenwu; Ren, Wei; Chen, Guanrong. An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2013, 9 (01): 427-438

[3] Dudek G, 1996, AUTON ROBOT, V3, P375, DOI 10.1007/BF00240651

[4] Ferber J, 2004, LECT NOTES COMPUT SC, V2935, P214

[5] Ferns N, 2014, UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, P210

[6] Ferns, Norm; Panangaden, Prakash; Precup, Doina. Bisimulation Metrics for Continuous Markov Decision Processes [J]. SIAM JOURNAL ON COMPUTING, 2011, 40 (06): 1662-1714

[7] Hansen-Estruch P, 2022, PR MACH LEARN RES

[8] Jeon J, 2022, PR MACH LEARN RES, P10041

[9] Kim J., 2021, P 34 ADV C NEUR INF, P28336

[10] Kong YL, 2023, Arxiv, DOI arXiv:2311.11315
