Sample-efficient multi-agent reinforcement learning with masked reconstruction

Cited by: 0
Authors
Kim, Jung In [1 ]
Lee, Young Jae [1 ]
Heo, Jongkook [1 ]
Park, Jinhyeok [1 ]
Kim, Jaehoon [1 ]
Lim, Sae Rin [1 ]
Jeong, Jinyong [1 ]
Kim, Seoung Bum [1 ]
Affiliations
[1] Korea Univ, Sch Ind & Management Engn, Seoul, South Korea
DOI: 10.1371/journal.pone.0291545
Chinese Library Classification
O [Mathematical Sciences and Chemistry]; P [Astronomy, Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline Classification Codes
07 ; 0710 ; 09 ;
Abstract
Deep reinforcement learning (DRL) combines reinforcement learning (RL) with deep learning to address complex decision-making problems in high-dimensional environments. Although DRL has been remarkably successful, its low sample efficiency necessitates long training times and large amounts of data to learn optimal policies, and these limitations are even more pronounced in multi-agent reinforcement learning (MARL). Various studies have sought to improve DRL in this respect. In this study, we propose an approach that combines a masked reconstruction task with QMIX (M-QMIX). By introducing masked reconstruction as an auxiliary task, we aim to improve sample efficiency, a fundamental limitation of RL in multi-agent systems. Experiments on the StarCraft II micromanagement benchmark validate the effectiveness of the proposed method. We used 11 scenarios: five easy, three hard, and three very hard. We deliberately limited the number of time steps in each scenario to demonstrate the improved sample efficiency. The proposed method outperforms QMIX in eight of the 11 scenarios. These results provide strong evidence that the proposed method is more sample-efficient than QMIX and effectively addresses the limitations of DRL in multi-agent systems.
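The record does not give the paper's exact architecture, but the idea of masked reconstruction as an auxiliary task can be illustrated in a few lines: randomly mask features of each agent's observation, encode the masked input, and train a decoder to reconstruct the original observation, adding the reconstruction loss to the RL loss. The following PyTorch sketch is illustrative only; all names, dimensions, and the masking scheme are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class MaskedReconstruction(nn.Module):
    """Illustrative masked-reconstruction auxiliary head (hypothetical design).

    Random features of each agent's observation are zeroed out, an encoder
    embeds the masked observation, and a decoder is trained to reconstruct
    the original (unmasked) observation with an MSE loss.
    """

    def __init__(self, obs_dim: int, hidden_dim: int = 64, mask_ratio: float = 0.5):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.encoder = nn.Sequential(nn.Linear(obs_dim, hidden_dim), nn.ReLU())
        self.decoder = nn.Linear(hidden_dim, obs_dim)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch * n_agents, obs_dim); keep each feature with prob. 1 - mask_ratio
        mask = (torch.rand_like(obs) > self.mask_ratio).float()
        recon = self.decoder(self.encoder(obs * mask))
        # Scalar auxiliary loss; in training it would be added to the
        # QMIX TD loss with a weighting coefficient.
        return ((recon - obs) ** 2).mean()
```

In a full training loop, this loss would typically share the encoder with the agents' Q-networks so that the reconstruction gradients shape the learned representations.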
Pages: 14