Two-stage reward allocation with decay for multi-agent coordinated behavior for sequential cooperative task by using deep reinforcement learning

Authors
Miyashita Y. [1, 2]
Sugawara T. [1]
Affiliations
[1] Department of Computer Science and Communications Engineering, Waseda University, Tokyo
[2] Shimizu Corporation, Tokyo
Source
Autonomous Intelligent Systems, Vol. 2, No. 1
Funding
Japan Society for the Promotion of Science
Keywords
Cooperation; Coordination; Divisional cooperation; Multi-agent deep reinforcement learning
DOI
10.1007/s43684-022-00029-z
Abstract
We propose a two-stage reward allocation method with decay using an extension of replay memory to adapt this rewarding method for deep reinforcement learning (DRL), to generate coordinated behaviors for tasks that can be completed by executing a few subtasks sequentially by heterogeneous agents. An independent learner in cooperative multi-agent systems needs to learn its policies for effective execution of its own responsible subtask, as well as for coordinated behaviors under a certain coordination structure. Although the reward scheme is an issue for DRL, it is difficult to design it to learn both policies. Our proposed method attempts to generate these different behaviors in multi-agent DRL by dividing the timing of rewards into two stages and varying the ratio between them over time. By introducing the coordinated delivery and execution problem with an expiration time, where a task can be executed sequentially by two heterogeneous agents, we experimentally analyze the effect of using various ratios of the reward division in the two-stage allocations on the generated behaviors. The results demonstrate that the proposed method could improve the overall performance relative to those with the conventional one-time or fixed reward and can establish robust coordinated behavior. © 2022, The Author(s).
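The idea of dividing a reward into two stages and varying the ratio between them over time can be illustrated with a minimal sketch. This is only one possible reading of the abstract, not the authors' implementation: the exponential schedule, the direction of the decay, and the names `stage_one_ratio`, `two_stage_rewards`, `initial_ratio`, and `decay` are all assumptions made for illustration, and the replay-memory extension mentioned in the abstract is not modeled.

```python
def stage_one_ratio(episode, initial_ratio=0.5, decay=0.999):
    """Share of the reward paid at the first stage (hypothetical schedule).

    Assumption: the share shrinks exponentially with the episode index.
    The abstract does not specify the actual decay rule.
    """
    return initial_ratio * decay ** episode


def two_stage_rewards(total_reward, episode, initial_ratio=0.5, decay=0.999):
    """Split one task reward into a first-stage and a second-stage part.

    Assumption: the first part is paid when the agent finishes its own
    subtask, and the remainder when the whole sequential task is completed.
    """
    ratio = stage_one_ratio(episode, initial_ratio, decay)
    first = total_reward * ratio
    second = total_reward - first  # the remainder keeps the total unchanged
    return first, second


if __name__ == "__main__":
    for episode in (0, 1000, 5000):
        first, second = two_stage_rewards(1.0, episode)
        print(f"episode {episode}: stage 1 = {first:.3f}, stage 2 = {second:.3f}")
```

Under these assumptions, early episodes reward finishing one's own subtask relatively more, while later episodes shift the weight toward completing the whole task, which is one way a varying ratio could encourage both kinds of behavior.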