A Decentralized Communication Framework Based on Dual-Level Recurrence for Multiagent Reinforcement Learning

Cited by: 3
Authors
Li, Xuesi [1]
Li, Jingchen [1]
Shi, Haobin [1]
Hwang, Kao-Shing [2]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci & Engn, Xian 710129, Shaanxi, Peoples R China
[2] Natl Sun Yat Sen Univ, Dept Elect Engn, Kaohsiung 804, Taiwan
Funding
National Natural Science Foundation of China
Keywords
Reinforcement learning; Logic gates; Training; Adaptation models; Multi-agent systems; Task analysis; Decision making; Gated recurrent network; multiagent reinforcement learning; multiagent system
DOI
10.1109/TCDS.2023.3281878
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Designing communication channels for multiagent systems is a feasible way to conduct decentralized learning, especially in partially observable environments or large-scale multiagent systems. In this work, a communication model with dual-level recurrence is developed to provide a more efficient communication mechanism for multiagent reinforcement learning. Communication is conducted by a gated-attention-based recurrent network, in which the historical states are taken into account and regarded as the second-level recurrence. We separate communication messages from memories in the recurrent model so that the proposed communication flow can adapt to changing communication targets when communication is limited, and so that the communication results are fair to every agent. We discuss our method in both partially observable and fully observable environments. The results of several experiments suggest that our method outperforms existing decentralized communication frameworks and the corresponding centralized training method.
Pages: 640-649
Page count: 10