An Information-Assisted Deep Reinforcement Learning Path Planning Scheme for Dynamic and Unknown Underwater Environment

Cited by: 6
Authors
Xi, Meng [1 ]
Yang, Jiachen [1 ]
Wen, Jiabao [1 ]
Li, Zhengjian [1 ]
Lu, Wen [2 ]
Gao, Xinbo [2 ]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Heuristic algorithms; Path planning; Reinforcement learning; Robustness; Neural networks; Vehicle dynamics; Oceans; Autonomous underwater vehicle (AUV); dynamic environment; path planning; reinforcement learning; robustness; TRACKING CONTROL; VEHICLES; ALGORITHM; LEVEL; AUV;
DOI
10.1109/TNNLS.2023.3332172
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Autonomous underwater vehicles (AUVs) have shown impressive potential and promising prospects in numerous marine missions, and path planning is the most essential prerequisite for their applications. Despite considerable prior effort, several limitations remain. A complete and realistic ocean simulation environment is critically needed: most existing methods rely on mathematical models and therefore suffer a large gap from reality. At the same time, the dynamic and unknown environment places high demands on robustness and generalization. To overcome these limitations, we propose an information-assisted reinforcement learning path planning scheme. First, it performs numerical modeling based on real ocean-current observations to establish a complete grid-based simulation environment, including 3-D terrain, dynamic currents, and local information. Next, we propose an information compression (IC) scheme that trims the mutual information (MI) between reinforcement learning neural network layers to improve generalization; a proof based on information theory provides solid support for this design. Moreover, to address the dynamic characteristics of the marine environment, we carefully design a confidence evaluator (CE), which measures the correlation between two adjacent frames of ocean currents to assign a confidence to the action. Numerical results demonstrate favorable sensitivity to ocean currents as well as high robustness and generalization in coping with the dynamic and unknown underwater environment.
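The abstract describes the confidence evaluator (CE) only at a high level: it scores the correlation between two adjacent ocean-current frames. As a minimal sketch of that idea (not the paper's actual CE module, whose architecture and confidence mapping are not given here), one could compute a Pearson correlation between consecutive flattened current-velocity grids and rescale it to [0, 1]; the function name `frame_confidence` and the linear rescaling are illustrative assumptions:

```python
import numpy as np

def frame_confidence(prev_frame: np.ndarray, curr_frame: np.ndarray) -> float:
    """Confidence in [0, 1] from the Pearson correlation between two
    consecutive ocean-current frames (velocity grids of equal shape).
    A slowly varying current field yields high correlation, hence
    high confidence in acting on the latest observation."""
    a = prev_frame.ravel().astype(float)
    b = curr_frame.ravel().astype(float)
    a -= a.mean()
    b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # constant (degenerate) frames carry no usable signal
    r = float(a @ b) / denom      # Pearson correlation in [-1, 1]
    return 0.5 * (r + 1.0)        # map to a confidence score in [0, 1]
```

Under this sketch, an unchanged current field gives confidence 1.0 and a fully reversed field gives 0.0, so a downstream planner could, for example, scale or gate its action by the score when currents change abruptly.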
Pages: 842 - 853
Number of pages: 12