Two-stage reward allocation with decay for multi-agent coordinated behavior for sequential cooperative task by using deep reinforcement learning

Cited by: 0

Authors:

Miyashita Y. [1,2]

Sugawara T. [1]

Affiliations:

[1] Department of Computer Science and Communications Engineering, Waseda University, Tokyo

[2] Shimizu Corporation, Tokyo

Source:

Autonomous Intelligent Systems, Volume 2, Issue 1

Funding:

Japan Society for the Promotion of Science

Keywords:

Cooperation; Coordination; Divisional cooperation; Multi-agent deep reinforcement learning

DOI:

10.1007/s43684-022-00029-z

Abstract:

We propose a two-stage reward allocation method with decay, together with an extension of replay memory that adapts this reward scheme to deep reinforcement learning (DRL), to generate coordinated behaviors for tasks that heterogeneous agents complete by executing a few subtasks sequentially. An independent learner in a cooperative multi-agent system needs to learn policies both for effectively executing the subtask it is responsible for and for coordinated behavior under a certain coordination structure. Although the reward scheme is a key issue in DRL, it is difficult to design one that lets agents learn both policies. Our proposed method attempts to generate these different behaviors in multi-agent DRL by dividing the timing of rewards into two stages and varying the ratio between them over time. By introducing the coordinated delivery and execution problem with an expiration time, in which a task is executed sequentially by two heterogeneous agents, we experimentally analyze how various ratios of the reward division in the two-stage allocation affect the generated behaviors. The results demonstrate that the proposed method can improve overall performance relative to conventional one-time or fixed reward schemes and can establish robust coordinated behavior. © 2022, The Author(s).
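The idea of dividing a task reward between two stages and changing the ratio over time can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the schedule, so the geometric decay, the function name `two_stage_rewards`, and the parameters `initial_ratio` and `decay` are all assumptions made for illustration only.

```python
def two_stage_rewards(total_reward, episode, initial_ratio=0.5, decay=0.99):
    """Split one task reward into a first-stage and a second-stage part.

    Assumed (hypothetical) schedule: the first-stage share starts at
    `initial_ratio` and shrinks geometrically with the episode index;
    whatever is left is paid at the second stage.
    """
    if not 0.0 <= initial_ratio <= 1.0:
        raise ValueError("initial_ratio must be in [0, 1]")
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must be in [0, 1]")
    if episode < 0:
        raise ValueError("episode must be non-negative")

    # Share of the reward paid at the first stage (e.g. after the first subtask).
    first_share = initial_ratio * decay ** episode
    first = total_reward * first_share
    # The remainder is paid at the second stage (e.g. on task completion),
    # so the two parts always sum to the total reward.
    second = total_reward - first
    return first, second


if __name__ == "__main__":
    for ep in (0, 1, 2):
        print(ep, two_stage_rewards(1.0, ep, initial_ratio=0.5, decay=0.5))
```

Under this assumed schedule the first-stage share shrinks as training proceeds, so the reward gradually shifts toward the second stage; other schedules would fit the abstract's description equally well.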
