A multi-agent reinforcement learning algorithm with the action preference selection strategy for massive target cooperative search mission planning

Cited by: 12
Authors
Wang, Xiaoyan [1]
Fang, Xi [1]
Affiliation
[1] Wuhan Univ Technol, Sch Sci, Wuhan 430070, Peoples R China
Keywords
Reinforce algorithm; Multi-agent; Cooperative target search; Action selection strategy; ROBOTIC SEARCH; SWARM; PSO; OPTIMIZATION; GO
DOI
10.1016/j.eswa.2023.120643
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Target search is widely applied in military reconnaissance, geological exploration and personnel search and rescue. Most target search algorithms perform well in single target search but are inefficient or even ineffective in multi-target search. To solve the multi-target search problem in an uncertain environment, this paper constructs a multi-agent massive target cooperative search mission planning model and proposes an improved reinforcement learning algorithm using the action preference selection strategy. Based on the Reinforce algorithm, this algorithm solves the problem of invalid searches within the stochastic strategy by changing the preferred action selection method. The proposed method improves the efficiency of multiple agents in the search for targets without collision using a cooperative mechanism and reward rules based on the odor effect. Simulation experiments are conducted in three aspects to verify the effectiveness and robustness of the improved algorithm and compare it with other reinforcement learning algorithms in the field of multi-agent learning. The results demonstrate that the improved algorithm has obvious advantages in terms of mission success rate, target search rate and average search time, and the movement trajectory of multiple agents is more concise.
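The abstract does not spell out how the action preference selection strategy works, so the following is only a minimal sketch of one plausible reading: a tabular REINFORCE-style softmax policy in which actions that would leave the grid or collide with another agent are masked out before sampling, so no step is spent on an invalid move. The grid world, the four-action set, and every function name here (`valid_actions`, `masked_softmax`, `select_action`, `reinforce_update`) are illustrative assumptions, not the authors' implementation, and the odor-effect reward rules are not modeled.

```python
import math
import random

# Hypothetical 4-action grid world; this action set is an assumption for illustration.
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def valid_actions(pos, width, height, occupied):
    """Return actions that stay on the grid and avoid cells held by other agents."""
    x, y = pos
    allowed = []
    for name, (dx, dy) in ACTIONS.items():
        nx, ny = x + dx, y + dy
        if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in occupied:
            allowed.append(name)
    return allowed


def masked_softmax(preferences, allowed):
    """Softmax over action preferences, restricted to the allowed actions."""
    if not allowed:
        return {}
    peak = max(preferences[a] for a in allowed)  # subtract max for numerical stability
    weights = {a: math.exp(preferences[a] - peak) for a in allowed}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}


def select_action(preferences, allowed, rng=random):
    """Sample one action from the masked policy."""
    probs = masked_softmax(preferences, allowed)
    names = list(probs)
    return rng.choices(names, weights=[probs[n] for n in names], k=1)[0]


def reinforce_update(preferences, allowed, action, ret, lr=0.1):
    """One REINFORCE step: h[b] += lr * G * (1[b == action] - pi(b)) for allowed b."""
    probs = masked_softmax(preferences, allowed)
    updated = dict(preferences)
    for b in allowed:
        indicator = 1.0 if b == action else 0.0
        updated[b] += lr * ret * (indicator - probs[b])
    return updated
```

Masked actions receive zero probability and zero gradient, so their preferences are left untouched by the update. Whether the paper masks actions in this way or reshapes the preferences differently cannot be determined from the abstract alone.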
Pages: 19