Autonomous Underwater Vehicle Path Planning Method of Soft Actor-Critic Based on Game Training

Cited by: 6

Authors:

Wang, Zhuo [1]

Lu, Hao [1]

Qin, Hongde [1]

Sui, Yancheng [1]

Affiliation:

[1] Harbin Engn Univ, Sch Naval Engn, Harbin 150001, Peoples R China

Source:

JOURNAL OF MARINE SCIENCE AND ENGINEERING | 2022 / Volume 10 / Issue 12

Keywords:

autonomous underwater vehicle; optimal path planning; deep reinforcement learning; unknown underwater environment; particle swarm optimization; REINFORCEMENT; ENVIRONMENT; AUV; UAV;

DOI:

10.3390/jmse10122018

Chinese Library Classification:

U6 [Waterway Transportation]; P75 [Ocean Engineering];

Discipline classification codes:

0814 ; 081505 ; 0824 ; 082401 ;

Abstract:

This study addresses the safe navigation of autonomous underwater vehicles (AUVs) in unknown underwater environments. During underwater navigation, an AUV encounters canyons, rocks, reefs, fish, and other underwater vehicles that threaten its safety. A game-based soft actor-critic (GSAC) path planning method is proposed to improve the adaptive capability of autonomous planning and the reliability of obstacle avoidance in unknown underwater environments. Considering the influence of the simulation environment, the obstacles in the simulation environment are regarded as agents that play a zero-sum game with the AUV. The zero-sum game is solved by improving the strategies of both the AUV and the obstacles, so that the simulation environment evolves intelligently with the AUV path planning strategy. The proposed method increases the complexity and diversity of the simulation environment, enables the AUV to train in a variable environment specific to its strategy, and improves the adaptability and convergence speed of the AUV in unknown underwater environments. Finally, an unknown underwater simulation environment is written in Python for AUV simulation testing. GSAC can guide the AUV to the target point in the unknown underwater environment while avoiding large and small static obstacles, canyons, and small dynamic obstacles. Compared with the soft actor-critic (SAC) and deep Q-network (DQN) algorithms, GSAC has better adaptability and convergence speed in the unknown underwater environment. The experiments verify that GSAC has faster convergence, better stability, and better robustness in unknown underwater environments.
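The zero-sum coupling between the AUV and the obstacle agents described in the abstract can be illustrated with a minimal sketch. The record does not give the paper's reward function, so the `StepOutcome` fields, the `goal_bonus` and `collision_penalty` values, and both function names are hypothetical; the only property taken from the abstract is that whatever the AUV gains, the obstacle side loses.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class StepOutcome:
    """Result of one simulated step (all fields are hypothetical)."""
    progress: float      # distance gained toward the target point this step
    collided: bool       # True if the AUV hit an obstacle
    reached_goal: bool   # True if the AUV arrived at the target point


def auv_reward(outcome: StepOutcome,
               goal_bonus: float = 10.0,
               collision_penalty: float = 10.0) -> float:
    """Illustrative AUV reward: progress, plus a goal bonus or collision penalty."""
    reward = outcome.progress
    if outcome.reached_goal:
        reward += goal_bonus
    if outcome.collided:
        reward -= collision_penalty
    return reward


def zero_sum_rewards(outcome: StepOutcome) -> Tuple[float, float]:
    """Return (AUV reward, obstacle reward); the two always sum to zero."""
    r_auv = auv_reward(outcome)
    return r_auv, -r_auv


if __name__ == "__main__":
    collision = StepOutcome(progress=0.5, collided=True, reached_goal=False)
    print(zero_sum_rewards(collision))  # (-9.5, 9.5)
```

Under this assumption, an obstacle policy trained to maximize its own reward is pushed toward configurations the current AUV policy handles poorly, which is one way a simulation environment could evolve alongside the planner.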

References

Pages: 22

47 entries in total

[1] Zermelo's problem: Optimal point-to-point navigation in 2D turbulent flows using reinforcement learning [J]. Biferale, L.; Bonaccorso, F.; Buzzicotti, M.; Di Leoni, P. Clark; Gustavsson, K. CHAOS, 2019, 29 (10)

[2] Target Search Control of AUV in Underwater Environment With Deep Reinforcement Learning [J]. Cao, Xiang; Sun, Changyin; Yan, Mingzhong. IEEE ACCESS, 2019, 7: 96549-96559

[3] A Review of Risk Analysis Research for the Operations of Autonomous Underwater Vehicles [J]. Chen, Xi; Bose, Neil; Brito, Mario; Khan, Faisal; Thanyamanta, Bo; Zou, Ting. RELIABILITY ENGINEERING & SYSTEM SAFETY, 2021, 216

[4] Path planning and obstacle avoidance for AUV: A review [J]. Cheng, Chunxi; Sha, Qixin; He, Bo; Li, Guangliang. OCEAN ENGINEERING, 2021, 235

[5] UAV Path Planning Based on Multi-Layer Reinforcement Learning Technique [J]. Cui, Zhengyang; Wang, Yong. IEEE ACCESS, 2021, 9: 59486-59497

[6] Neural networks based reinforcement learning for mobile robots obstacle avoidance [J]. Duguleana, Mihai; Mogan, Gheorghe. EXPERT SYSTEMS WITH APPLICATIONS, 2016, 62: 104-115

[7] Improved Artificial Potential Field Method Applied for AUV Path Planning [J]. Fan, Xiaojing; Guo, Yinjing; Liu, Hui; Wei, Bowen; Lyu, Wenhong. MATHEMATICAL PROBLEMS IN ENGINEERING, 2020, 2020

[8] Fujimoto S, 2018, PR MACH LEARN RES, V80

[9] Haarnoja T, 2018, PR MACH LEARN RES, V80

[10] Research for UAV Path Planning Method Based on Guided Sarsa Algorithm [J]. He Boming; Lin Wei; Mei Fuzeng; Fan Huahao. 2022 2ND IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND ARTIFICIAL INTELLIGENCE (SEAI 2022), 2022: 220-224
