Autonomous underwater vehicle path planning based on actor-multi-critic reinforcement learning

Cited by: 16

Authors:

Wang, Zhuo [1,2]

Zhang, Shiwei [1]

Feng, Xiaoning [3]

Sui, Yancheng [1]

Affiliations:

[1] Harbin Engn Univ, Sci & Technol Underwater Vehicle Lab, Harbin, Peoples R China

[2] Peng Cheng Lab, Shenzhen, Peoples R China

[3] Harbin Engn Univ, Coll Comp Sci & Technol, Harbin 150001, Peoples R China

Source:

PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART I-JOURNAL OF SYSTEMS AND CONTROL ENGINEERING | 2021, Vol. 235, No. 10

Funding:

National Natural Science Foundation of China;

Keywords:

Autonomous underwater vehicle; path planning; dynamic obstacle avoidance; actor-critic; neural networks; FEEDFORWARD NETWORKS; ENVIRONMENT;

DOI:

10.1177/0959651820937085

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Environmental adaptability has long been a problem in path planning for autonomous underwater vehicles. Although reinforcement learning can improve environmental adaptability, multi-behavior coupling slows its convergence, which makes it difficult for an autonomous underwater vehicle to avoid moving obstacles. This article proposes a multi-behavior critic reinforcement learning algorithm for autonomous underwater vehicle path planning. The algorithm targets the oscillating amplitudes and low learning efficiency in the early stages of training that are common in traditional actor-critic algorithms. Behavior critic reinforcement learning assesses the actor's actions from perspectives such as energy saving and security, then combines these assessments into an overall evaluation of the actor. The policy gradient method is used for the actor, and the value function method is used for the critic. Both are approximated by backpropagation neural networks whose parameters are updated by gradient descent. The simulation results show that the method can optimize its learning in the environment and improve learning efficiency, meeting the real-time and adaptability requirements of dynamic obstacle avoidance for autonomous underwater vehicles.
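
The actor-multi-critic structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract specifies backpropagation neural networks, whereas this sketch substitutes linear function approximators to stay short. The class names, the two reward channels (energy, safety), the reward values, the critic weights, and the one-step toy environment are all hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


class LinearCritic:
    """One critic per behavior: linear value estimate V(s) = w . s.
    (Assumption: linear stand-in for the paper's BP neural network.)"""

    def __init__(self, n_features, lr=0.05, gamma=0.9):
        self.w = [0.0] * n_features
        self.lr = lr
        self.gamma = gamma

    def value(self, state):
        return sum(wi * si for wi, si in zip(self.w, state))

    def update(self, state, reward, next_state, done):
        """TD(0) update; returns the TD error computed before the update."""
        bootstrap = 0.0 if done else self.gamma * self.value(next_state)
        td_error = reward + bootstrap - self.value(state)
        self.w = [wi + self.lr * td_error * si
                  for wi, si in zip(self.w, state)]
        return td_error


class SoftmaxActor:
    """Policy-gradient actor with linear action preferences."""

    def __init__(self, n_features, n_actions, lr=0.05):
        self.theta = [[0.0] * n_features for _ in range(n_actions)]
        self.lr = lr

    def probs(self, state):
        prefs = [sum(t * s for t, s in zip(row, state)) for row in self.theta]
        return softmax(prefs)

    def act(self, state, rng):
        return rng.choices(range(len(self.theta)),
                           weights=self.probs(state))[0]

    def update(self, state, action, advantage):
        """Step along advantage * grad log pi(action | state)."""
        p = self.probs(state)
        for a, row in enumerate(self.theta):
            grad = (1.0 if a == action else 0.0) - p[a]
            for i, s in enumerate(state):
                row[i] += self.lr * advantage * grad * s


def combined_td_error(td_errors, weights):
    """Merge the per-behavior critiques into one evaluation of the actor.
    (Assumption: a weighted sum; the abstract does not give the rule.)"""
    return sum(w * d for w, d in zip(weights, td_errors))


def train_demo(steps=2000, seed=0):
    """Toy one-step problem with hypothetical (energy, safety) rewards."""
    rng = random.Random(seed)
    state = [1.0]                    # single constant feature
    rewards = {0: (-1.0, -1.0),      # action 0: costly and unsafe
               1: (-0.2, 1.0)}       # action 1: cheap and safe detour
    critic_weights = [0.5, 0.5]      # hypothetical behavior weights
    actor = SoftmaxActor(n_features=1, n_actions=2, lr=0.1)
    critics = [LinearCritic(1), LinearCritic(1)]
    for _ in range(steps):
        action = actor.act(state, rng)
        td_errors = [c.update(state, r, state, done=True)
                     for c, r in zip(critics, rewards[action])]
        actor.update(state, action,
                     combined_td_error(td_errors, critic_weights))
    return actor
```

Calling `train_demo()` returns the trained actor, and `actor.probs([1.0])` gives its action probabilities in the toy state. Keeping one critic per behavior means each value estimate tracks a single reward signal, and only the combined TD error reaches the actor.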


Pages: 1787-1796

Page count: 10
