Coverage path planning for maritime search and rescue using reinforcement learning

Cited by: 72

Authors:

Ai, Bo ^{[1]}

Jia, Maoxin ^{[1]}

Xu, Hanwen ^{[1]}

Xu, Jiangling ^{[2]}

Wen, Zhen ^{[1]}

Li, Benshuai ^{[1,3]}

Zhang, Dan ^{[1]}

Affiliations:

[1] Shandong Univ Sci & Technol, Coll Geodesy & Geomat, Qingdao 266590, Peoples R China

[2] State Ocean Adm, North China Sea Marine Forecasting Ctr, Qingdao 266100, Peoples R China

[3] Qingdao Yuehai Informat Serv Co Ltd, Qingdao 266590, Peoples R China

Source:

OCEAN ENGINEERING | 2021 / Vol. 241

Funding:

National Natural Science Foundation of China;

Keywords:

Maritime search and rescue; Reinforcement learning; Coverage path planning; Intelligent planning; APPROXIMATION; ALGORITHMS;

DOI:

10.1016/j.oceaneng.2021.110098

Chinese Library Classification (CLC) number:

U6 [Waterway transportation]; P75 [Ocean engineering];

Subject classification codes:

0814 ; 081505 ; 0824 ; 082401 ;

Abstract:

In maritime search and rescue (SAR), the planning of the search path directly affects the efficiency of searching for people overboard in the search area. However, traditional SAR decision-making schemes often adopt a fixed search path planning mode, which suffers from poor flexibility, low efficiency, and insufficient intelligence. This paper plans a search path that minimizes search time and covers high-probability areas first, while achieving complete coverage of the maritime SAR area and avoiding maritime obstacles. First, a maritime SAR environment model is built using marine environmental field data and electronic charts. Second, an autonomous coverage path planning model for maritime SAR is proposed based on reinforcement learning, in which a reward function with multiple constraints is designed to guide the navigation actions of the vessel agent. During iterative training of the path planning model, the random action selection probability is dynamically adjusted by a nonlinear action selection policy to ensure stable convergence of the model. Finally, experimental verification is conducted in different small-scale maritime SAR simulation scenarios. The results indicate that the search path covers the high-probability areas preferentially, with lower repeated coverage and shorter path length than other path planning algorithms.
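The two mechanisms named in the abstract, a multi-constraint reward and a nonlinearly adjusted random-action probability, can be illustrated with a small tabular Q-learning sketch on a grid. This is not the authors' model: the exponential decay form, the reward weights, the grid representation, and every function and parameter name below are assumptions chosen only for illustration.

```python
import math
import random

# Four grid moves: up, down, left, right (an assumed action set).
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def epsilon(episode, total, eps_start=1.0, eps_end=0.05, rate=5.0):
    """Random-action probability with an assumed exponential (nonlinear) decay."""
    return eps_end + (eps_start - eps_end) * math.exp(-rate * episode / total)


def reward(prob, visited, blocked):
    """Hypothetical multi-constraint reward with illustrative weights."""
    if blocked:          # obstacle or outside the search area
        return -10.0
    if visited:          # repeated coverage is penalized
        return -1.0
    return 1.0 + 10.0 * prob  # new cell: bonus grows with target probability


def train(prob_map, obstacles, start, episodes=2000, alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning; the state is (position, set of covered cells)."""
    rng = random.Random(seed)
    rows, cols = len(prob_map), len(prob_map[0])
    free_cells = rows * cols - len(obstacles)
    q = {}
    for ep in range(episodes):
        pos, visited = start, frozenset([start])
        eps = epsilon(ep, episodes)
        for _ in range(4 * rows * cols):  # step cap per episode
            qs = q.setdefault((pos, visited), [0.0] * 4)
            # Epsilon-greedy: explore with probability eps, else act greedily.
            a = rng.randrange(4) if rng.random() < eps else qs.index(max(qs))
            nr, nc = pos[0] + MOVES[a][0], pos[1] + MOVES[a][1]
            blocked = not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in obstacles
            nxt = pos if blocked else (nr, nc)
            r = reward(0.0 if blocked else prob_map[nr][nc], nxt in visited, blocked)
            new_visited = visited | {nxt}
            done = len(new_visited) == free_cells  # complete coverage reached
            next_qs = q.setdefault((nxt, new_visited), [0.0] * 4)
            target = r + (0.0 if done else gamma * max(next_qs))
            qs[a] += alpha * (target - qs[a])
            pos, visited = nxt, new_visited
            if done:
                break
    return q
```

Encoding the covered cells in the state keeps the sketch faithful to a coverage task but only scales to very small grids, which is why it is suitable for illustration rather than for realistic SAR areas.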


Pages: 11
