Bottom-up multi-agent reinforcement learning by reward shaping for cooperative-competitive tasks

Cited by: 0
Authors
Takumi Aotani
Taisuke Kobayashi
Kenji Sugimoto
Affiliation
[1] Nara Institute of Science and Technology, Division of Information Science
Source
Applied Intelligence | 2021, Vol. 51
Keywords
Distributed autonomous system; Reinforcement learning; Reward shaping; Interests between agents
DOI
Not available
Abstract
A multi-agent system (MAS) is expected to be applied to various real-world problems in which a single agent cannot accomplish the given tasks. Due to the inherent complexity of real-world MASs, however, manually designing the group behaviors of the agents is intractable. Multi-agent reinforcement learning (MARL), a framework in which multiple agents in the same environment adaptively learn their policies by reinforcement learning, is a promising methodology for coping with this complexity. To acquire group behaviors through MARL, all agents must understand how to achieve their respective tasks cooperatively. We have previously proposed “bottom-up MARL”, a decentralized system for managing real-world and large-scale MARL, together with a reward shaping algorithm that represents the group behaviors. That reward shaping algorithm, however, assumes that all agents are in cooperative relationships to some extent. In this paper, we therefore extend the algorithm so that the agents need not know the interests between them in advance. The interests are regarded as correlation coefficients derived from the agents’ rewards and are estimated numerically in an online manner. In both simulations and real experiments without prior knowledge of the interests, the agents correctly estimated their interests and thereby derived new rewards that represent feasible group behaviors in a decentralized manner. As a result, the extended algorithm succeeded in acquiring group behaviors for tasks ranging from cooperative to competitive.
Pages: 4434-4452 (18 pages)
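
The abstract describes treating inter-agent interests as correlation coefficients between the agents' rewards, estimated numerically in an online manner, with each agent then deriving a new shaped reward from them. The record gives no update equations, so the sketch below is only a minimal illustration of that idea under two explicit assumptions: the correlation is tracked with exponential moving averages of the reward moments, and the shaped reward is a plain linear mix of an agent's own reward and a neighbor's reward weighted by the estimated interest. The names `OnlineInterestEstimator`, `beta`, and `shaped_reward` are hypothetical, not from the paper.

```python
import math
import random

class OnlineInterestEstimator:
    """Running correlation between two agents' reward streams.

    Tracks first and second moments with exponential moving averages (EMA);
    `beta` close to 1 forgets old rewards slowly. This is an illustrative
    estimator, not the paper's exact update rule.
    """

    def __init__(self, beta: float = 0.99):
        self.beta = beta
        self.mean_i = 0.0  # EMA of agent i's reward
        self.mean_j = 0.0  # EMA of agent j's reward
        self.var_i = 1e-8  # EMA of agent i's squared deviation
        self.var_j = 1e-8  # EMA of agent j's squared deviation
        self.cov = 0.0     # EMA of the cross deviation

    def update(self, r_i: float, r_j: float) -> float:
        """Fold in the latest rewards and return the current correlation."""
        b = self.beta
        self.mean_i = b * self.mean_i + (1.0 - b) * r_i
        self.mean_j = b * self.mean_j + (1.0 - b) * r_j
        d_i, d_j = r_i - self.mean_i, r_j - self.mean_j
        self.var_i = b * self.var_i + (1.0 - b) * d_i * d_i
        self.var_j = b * self.var_j + (1.0 - b) * d_j * d_j
        self.cov = b * self.cov + (1.0 - b) * d_i * d_j
        corr = self.cov / (math.sqrt(self.var_i * self.var_j) + 1e-12)
        # Clip to [-1, 1]: +1 reads as a cooperative interest, -1 as competitive.
        return max(-1.0, min(1.0, corr))

def shaped_reward(r_own: float, r_other: float, interest: float) -> float:
    """Mix the other agent's reward into one's own, weighted by the interest."""
    return r_own + interest * r_other

# Toy usage: two agents whose rewards share a common component,
# so the estimated interest should converge to a positive value.
est = OnlineInterestEstimator(beta=0.95)
rho = 0.0
for _ in range(1000):
    common = random.gauss(0.0, 1.0)
    r_i = common + 0.3 * random.gauss(0.0, 1.0)
    r_j = common + 0.3 * random.gauss(0.0, 1.0)
    rho = est.update(r_i, r_j)
print(f"estimated interest ~ {rho:.2f}, shaped reward = {shaped_reward(r_i, r_j, rho):.2f}")
```

Under these assumptions, a positive estimate mixes the neighbor's reward in with positive weight, shaping toward cooperation, while a negative estimate mixes it in with negative weight, shaping toward competition; this loosely mirrors the cooperative-to-competitive range of group behaviors the abstract reports.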
Related papers (50 in total)
  • [21] Reward shaping for knowledge-based multi-objective multi-agent reinforcement learning
    Mannion, Patrick
    Devlin, Sam
    Duggan, Jim
    Howley, Enda
    KNOWLEDGE ENGINEERING REVIEW, 2018, 33
  • [22] Credit assignment in heterogeneous multi-agent reinforcement learning for fully cooperative tasks
    Jiang, Kun
    Liu, Wenzhang
    Wang, Yuanda
    Dong, Lu
    Sun, Changyin
    APPLIED INTELLIGENCE, 2023, 53 (23) : 29205 - 29222
  • [24] Direct reward and indirect reward in multi-agent reinforcement learning
    Ohta, M
    ROBOCUP 2002: ROBOT SOCCER WORLD CUP VI, 2003, 2752 : 359 - 366
  • [27] Tactical Reward Shaping for Large-Scale Combat by Multi-Agent Reinforcement Learning
    Duo, Nanxun
    Wang, Qinzhao
    Lyu, Qiang
    Wang, Wei
    JOURNAL OF SYSTEMS ENGINEERING AND ELECTRONICS, 2024, 35 (06) : 1516 - 1529
  • [28] Multi-Agent Uncertainty Sharing for Cooperative Multi-Agent Reinforcement Learning
    Chen, Hao
    Yang, Guangkai
    Zhang, Junge
    Yin, Qiyue
    Huang, Kaiqi
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [29] Reinforcement Learning for Multi-Agent Competitive Scenarios
    Coutinho, Manuel
    Reis, Luis Paulo
    2022 IEEE INTERNATIONAL CONFERENCE ON AUTONOMOUS ROBOT SYSTEMS AND COMPETITIONS (ICARSC), 2022, : 130 - 135
  • [30] LJIR: Learning Joint-Action Intrinsic Reward in cooperative multi-agent reinforcement learning
    Chen, Zihan
    Luo, Biao
    Hu, Tianmeng
    Xu, Xiaodong
    NEURAL NETWORKS, 2023, 167 : 450 - 459