Probabilistic Reward-Based Reinforcement Learning for Multi-Agent Pursuit and Evasion

Cited by: 1

Authors
Zhang, Bo-Kun [1 ]
Hu, Bin [1 ]
Chen, Long [1 ]
Zhang, Ding-Xue [2 ]
Cheng, Xin-Ming [3 ]
Guan, Zhi-Hong [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Artificial Intelligence & Automat, Wuhan 430074, Peoples R China
[2] Yangtze Univ, Sch Petr Engn, Jingzhou 434023, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China
Source
PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021) | 2021
Keywords
Reinforcement learning; Multi-agent; Pursuit-evasion; Probabilistic reward; SYSTEMS;
DOI
10.1109/CCDC52312.2021.9601771
CLC Classification Number
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Reinforcement learning is studied in this article to solve multi-agent pursuit-and-evasion games. The main problem with current multi-agent reinforcement learning is the agents' low learning efficiency. An important factor behind this problem is that the delay of the Q function depends on changes in the environment. To address it, a probabilistic reward distribution is used to replace the Q function in the multi-agent deep deterministic policy gradient framework (hereinafter referred to as MADDPG). The distributional Bellman equation is proved to be convergent and can therefore be brought into the reinforcement learning framework. The probabilistic reward distribution is updated within the algorithm, so that the reward value adapts better to complex environments. At the same time, eliminating the delay of rewards improves the efficiency of the strategy and yields better pursuit-evasion results. Simulations and experiments show that the multi-agent algorithm with distributional rewards achieves better results in the given environment.
Pages: 3352-3357
Number of pages: 6
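The abstract's core idea, replacing a scalar Q value with an updated probability distribution over returns via a distributional Bellman equation, can be illustrated with a generic categorical (C51-style) projection step. This is a hedged sketch of standard distributional RL, not the authors' exact formulation; the function name `project_distribution` and the fixed-support representation are illustrative assumptions.

```python
import numpy as np

def project_distribution(next_probs, rewards, dones, gamma, atoms):
    """Categorical distributional Bellman backup (illustrative sketch).

    Projects the shifted target distribution r + gamma * z onto a fixed
    support of atoms, splitting probability mass between neighbouring atoms.
    next_probs: (batch, n_atoms) probabilities of the next-state distribution.
    rewards, dones: (batch,) arrays; atoms: (n_atoms,) fixed support values.
    """
    n, n_atoms = next_probs.shape
    v_min, v_max = atoms[0], atoms[-1]
    delta = (v_max - v_min) / (n_atoms - 1)
    # Shifted support for each transition, clipped to the fixed range.
    tz = np.clip(rewards[:, None] + gamma * (1.0 - dones[:, None]) * atoms[None, :],
                 v_min, v_max)
    b = (tz - v_min) / delta              # fractional atom index
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)
    target = np.zeros((n, n_atoms))
    for i in range(n):
        for j in range(n_atoms):
            if lower[i, j] == upper[i, j]:
                # Shifted atom lands exactly on a support point.
                target[i, lower[i, j]] += next_probs[i, j]
            else:
                # Split the mass proportionally between the two neighbours.
                target[i, lower[i, j]] += next_probs[i, j] * (upper[i, j] - b[i, j])
                target[i, upper[i, j]] += next_probs[i, j] * (b[i, j] - lower[i, j])
    return target
```

In a MADDPG-style actor-critic setup, each agent's centralized critic would output `next_probs` and be trained to match `target` (e.g. by cross-entropy), so the reward signal is carried as a full distribution rather than a delayed scalar estimate.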