A Projection-based Exploration Method for Multi-Agent Coordination

Cited by: 0
Authors
Tang, Hainan [1]
Liu, Juntao [1]
Wang, Zhenjie [1]
Gao, Ziwen [1]
Li, You [2]
Affiliations
[1] Wuhan Digital Engn Inst, Wuhan, Hubei, Peoples R China
[2] Hubei Univ, Wuhan, Hubei, Peoples R China
Keywords
Projection Exploration; Multi-agent Coordination; Maximum distribution entropy;
DOI
10.1145/3669721.3669723
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In multi-agent reinforcement learning (MARL), states with high exploration value are difficult to identify and to visit in a coordinated manner, which results in low learning efficiency. To address this, a projection-based exploration method for multi-agent coordination (PEMAC) is proposed. Goal states are selected using a count-based approach in the optimal projection space, that is, the projection space in which the entropy of the state distribution is maximal. Then, by reshaping the rewards in the replay buffer, agents are trained to visit those high-value states in a coordinated manner. To verify the effectiveness of the proposed method, comparative experiments are conducted in the multi-particle environment (MPE), in which both dense-reward and sparse-reward settings are considered. The results suggest that PEMAC can effectively improve learning efficiency.
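The abstract gives no formulas, so the following is only a minimal sketch of the three steps it names (choosing a maximum-entropy projection, count-based goal selection, and reward reshaping in the replay buffer), not the authors' implementation. It assumes discretized states stored as tuples, axis-aligned projections onto `k` state dimensions, and a fixed additive bonus; the function names, the `k` and `bonus` parameters, and the buffer layout are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def projection_entropy(states, dims):
    """Shannon entropy (in nats) of visit counts after projecting states onto dims."""
    counts = Counter(tuple(s[d] for d in dims) for s in states)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_goal(states, k):
    """Assumed procedure: pick the k-dimensional axis-aligned projection with
    maximal entropy, then return the least-visited cell in that projection."""
    n_dims = len(states[0])
    best_dims = max(
        combinations(range(n_dims), k),
        key=lambda dims: projection_entropy(states, dims),
    )
    counts = Counter(tuple(s[d] for d in best_dims) for s in states)
    goal = min(counts, key=counts.get)  # count-based choice: rarest projected state
    return best_dims, goal


def reshape_rewards(buffer, dims, goal, bonus=1.0):
    """Return a new buffer in which transitions whose next state projects
    onto the goal receive an additive bonus (an assumed reshaping rule)."""
    reshaped = []
    for state, action, reward, next_state in buffer:
        hit = tuple(next_state[d] for d in dims) == goal
        reshaped.append((state, action, reward + (bonus if hit else 0.0), next_state))
    return reshaped


# Toy usage: dimension 0 never varies, so dimension 1 is the max-entropy projection.
states = [(0, 0), (0, 1), (0, 1), (0, 2), (0, 2)]
dims, goal = select_goal(states, k=1)
buffer = [((0, 1), "a", 0.5, (0, 0)), ((0, 0), "b", 0.0, (0, 2))]
new_buffer = reshape_rewards(buffer, dims, goal)
```

In this toy example the projection onto dimension 1 is selected because its visit distribution is the most spread out, and the cell visited only once becomes the goal, so only the first transition receives the bonus.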
Pages: 8-14
Number of pages: 7