An Information-Assisted Deep Reinforcement Learning Path Planning Scheme for Dynamic and Unknown Underwater Environment

Cited by: 6
Authors
Xi, Meng [1 ]
Yang, Jiachen [1 ]
Wen, Jiabao [1 ]
Li, Zhengjian [1 ]
Lu, Wen [2 ]
Gao, Xinbo [2 ]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Heuristic algorithms; Path planning; Reinforcement learning; Robustness; Neural networks; Vehicle dynamics; Oceans; Autonomous underwater vehicle (AUV); dynamic environment; path planning; reinforcement learning; robustness; TRACKING CONTROL; VEHICLES; ALGORITHM; LEVEL; AUV;
DOI
10.1109/TNNLS.2023.3332172
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Autonomous underwater vehicles (AUVs) have shown impressive potential and promising prospects in numerous marine missions, and path planning is the most essential prerequisite for their applications. Despite considerable prior effort, several limitations remain. A complete and realistic ocean simulation environment is critically needed: most existing methods rely on mathematical models and therefore suffer a large gap from reality. At the same time, the dynamic and unknown environment places high demands on robustness and generalization. To overcome these limitations, we propose an information-assisted reinforcement learning path planning scheme. First, it performs numerical modeling based on real ocean-current observations to establish a complete grid-based simulation environment, including 3-D terrain, dynamic currents, and local information. Next, we propose an information compression (IC) scheme that trims the mutual information (MI) between reinforcement learning neural network layers to improve generalization; a proof based on information theory provides solid support for this design. Moreover, to address the dynamic characteristics of the marine environment, we carefully design a confidence evaluator (CE), which measures the correlation between two adjacent frames of ocean currents to assign a confidence to the action. Numerical results demonstrate favorable sensitivity to ocean currents as well as high robustness and generalization in coping with the dynamic and unknown underwater environment.
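The abstract describes the confidence evaluator (CE) only at a high level: it scores the correlation between two adjacent ocean-current frames. As a minimal sketch of that idea (not the paper's actual CE module, whose architecture and confidence mapping are not given here), one could compute a Pearson correlation between consecutive flattened current-velocity grids and rescale it to [0, 1]; the function name `frame_confidence` and the linear rescaling are illustrative assumptions:

```python
import numpy as np

def frame_confidence(prev_frame: np.ndarray, curr_frame: np.ndarray) -> float:
    """Confidence in [0, 1] from the Pearson correlation between two
    consecutive ocean-current frames (velocity grids of equal shape).
    A slowly varying current field yields high correlation, hence
    high confidence in acting on the latest observation."""
    a = prev_frame.ravel().astype(float)
    b = curr_frame.ravel().astype(float)
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # constant (degenerate) frames carry no usable signal
    r = float(a @ b) / denom      # Pearson correlation in [-1, 1]
    return 0.5 * (r + 1.0)        # map to a confidence score in [0, 1]
```

Under this sketch, an unchanged current field gives confidence 1.0 and a fully reversed field gives 0.0, so a downstream planner could, for example, scale or gate its action by the score when currents change abruptly.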
Pages: 842 - 853
Number of pages: 12