Two-stage reward allocation with decay for multi-agent coordinated behavior for sequential cooperative task by using deep reinforcement learning

Cited by: 0

Authors:

Miyashita Y. [1,2]

Sugawara T. [1]

Affiliations:

[1] Department of Computer Science and Communications Engineering, Waseda University, Tokyo

[2] Shimizu Corporation, Tokyo

Source:

Autonomous Intelligent Systems, Volume 2, Issue 1

Funding:

Japan Society for the Promotion of Science

Keywords:

Cooperation; Coordination; Divisional cooperation; Multi-agent deep reinforcement learning

DOI:

10.1007/s43684-022-00029-z

Abstract:

We propose a two-stage reward allocation method with decay using an extension of replay memory to adapt this rewarding method for deep reinforcement learning (DRL), to generate coordinated behaviors for tasks that can be completed by executing a few subtasks sequentially by heterogeneous agents. An independent learner in cooperative multi-agent systems needs to learn its policies for effective execution of its own responsible subtask, as well as for coordinated behaviors under a certain coordination structure. Although the reward scheme is an issue for DRL, it is difficult to design it to learn both policies. Our proposed method attempts to generate these different behaviors in multi-agent DRL by dividing the timing of rewards into two stages and varying the ratio between them over time. By introducing the coordinated delivery and execution problem with an expiration time, where a task can be executed sequentially by two heterogeneous agents, we experimentally analyze the effect of using various ratios of the reward division in the two-stage allocations on the generated behaviors. The results demonstrate that the proposed method could improve the overall performance relative to those with the conventional one-time or fixed reward and can establish robust coordinated behavior. © 2022, The Author(s).
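The abstract does not give the reward formulas, so the following is only an illustrative sketch of how a two-stage reward split with a decaying ratio and a replay memory that accepts late rewards might be arranged. Every name here (`first_stage_ratio`, `split_reward`, `TwoStageReplayMemory`, `settle`, `expire`), the exponential decay schedule, and the choice to add the second-stage reward to an already stored transition are assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import deque
from dataclasses import dataclass
from typing import Any


def first_stage_ratio(episode, initial_ratio=1.0, decay=0.999, floor=0.0):
    """Assumed schedule: share of the reward paid at the first stage,
    decaying exponentially with the episode index."""
    return max(floor, initial_ratio * decay ** episode)


def split_reward(total_reward, ratio):
    """Divide one task reward into a first-stage and a second-stage part."""
    first = total_reward * ratio
    return first, total_reward - first


@dataclass
class Transition:
    state: Any
    action: Any
    reward: float
    next_state: Any
    done: bool


class TwoStageReplayMemory:
    """Replay memory that can credit a stored transition later (assumed design)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)
        self.pending = {}  # task_id -> transition awaiting its second-stage reward

    def push(self, transition, task_id=None):
        self.buffer.append(transition)
        if task_id is not None:
            self.pending[task_id] = transition

    def settle(self, task_id, second_stage_reward):
        """Add the second-stage reward once the follow-up subtask succeeds."""
        transition = self.pending.pop(task_id, None)
        if transition is None:
            return False
        transition.reward += second_stage_reward
        return True

    def expire(self, task_id):
        """Drop a pending task whose expiration time passed; no second reward."""
        return self.pending.pop(task_id, None) is not None

    def sample(self, batch_size):
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))
```

In this sketch, the first agent's transition is stored with the first-stage share when it finishes its own subtask, and `settle` adds the remainder only if the second agent completes the task before it expires. Lowering the ratio over episodes shifts the incentive from finishing one's own subtask toward completing the whole task in coordination.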
