Autonomous underwater vehicle path planning based on actor-multi-critic reinforcement learning

Cited by: 16
Authors
Wang, Zhuo [1, 2]
Zhang, Shiwei [1]
Feng, Xiaoning [3]
Sui, Yancheng [1]
Affiliations
[1] Harbin Engn Univ, Sci & Technol Underwater Vehicle Lab, Harbin, Peoples R China
[2] Peng Cheng Lab, Shenzhen, Peoples R China
[3] Harbin Engn Univ, Coll Comp Sci & Technol, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Autonomous underwater vehicle; path planning; dynamic obstacle avoidance; actor-critic; neural networks; FEEDFORWARD NETWORKS; ENVIRONMENT;
DOI
10.1177/0959651820937085
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Environmental adaptability remains a persistent problem in path planning for autonomous underwater vehicles (AUVs). Although reinforcement learning can improve environmental adaptability, multi-behavior coupling slows its convergence, which makes it difficult for an AUV to avoid moving obstacles. This article proposes a multi-behavior critic reinforcement learning algorithm for AUV path planning that addresses the oscillating amplitudes and low learning efficiency in the early stages of training that are common in traditional actor-critic algorithms. The behavior critics assess the actor's actions from perspectives such as energy saving and security, and these assessments are combined into an overall evaluation of the actor. The policy gradient method is used for the actor, and the value function method is used for the critic. Both are approximated by backpropagation neural networks whose parameters are updated by gradient descent. Simulation results show that the method can optimize its learning in the environment and improves learning efficiency, meeting the real-time and adaptability requirements of dynamic obstacle avoidance for AUVs.
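The actor-multi-critic idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes linear function approximators in place of the backpropagation neural networks, a softmax policy over discrete actions, one reward signal per perspective (for example energy saving and security), and a fixed weighted sum as the rule for combining the critics. The class name `ActorMultiCritic`, the `critic_weights` parameter, and all learning rates are hypothetical.

```python
import numpy as np


class ActorMultiCritic:
    """Softmax actor with one linear critic per evaluation perspective.

    Assumption: linear approximators stand in for the neural networks
    mentioned in the abstract, to keep the sketch short.
    """

    def __init__(self, n_features, n_actions, critic_weights,
                 actor_lr=0.01, critic_lr=0.05, gamma=0.95, seed=0):
        self.rng = np.random.default_rng(seed)
        self.theta = np.zeros((n_actions, n_features))        # actor parameters
        self.w = np.zeros((len(critic_weights), n_features))  # one row per critic
        # Hypothetical mixing weights; the abstract does not state the rule.
        self.mix = np.asarray(critic_weights, dtype=float)
        self.actor_lr = actor_lr
        self.critic_lr = critic_lr
        self.gamma = gamma

    def policy(self, s):
        """Action probabilities from a numerically stable softmax."""
        logits = self.theta @ s
        e = np.exp(logits - logits.max())
        return e / e.sum()

    def act(self, s):
        """Sample an action index from the current policy."""
        return int(self.rng.choice(len(self.theta), p=self.policy(s)))

    def update(self, s, a, rewards, s_next, done):
        """One TD(0) step; `rewards` holds one reward per critic."""
        rewards = np.asarray(rewards, dtype=float)
        v = self.w @ s
        v_next = np.zeros_like(v) if done else self.w @ s_next
        td = rewards + self.gamma * v_next - v  # one TD error per critic

        # The policy is evaluated before any parameter changes.
        pi = self.policy(s)

        # Each critic moves its value estimate toward its own TD target.
        self.w += self.critic_lr * td[:, None] * s[None, :]

        # Combine the per-critic assessments into one evaluation of the actor.
        combined = float(self.mix @ td)

        # Policy gradient step: grad of log pi(a|s) for a linear softmax policy
        # is (onehot(a) - pi) outer s.
        grad = -pi[:, None] * s[None, :]
        grad[a] += s
        self.theta += self.actor_lr * combined * grad
        return combined
```

In this sketch a caller would build the state feature vector `s` from the vehicle's sensors, call `act(s)` to choose a heading or speed command, and pass the per-perspective rewards to `update`. How the original work weights or combines its critics is not given in the abstract, so the weighted sum here should be read as a placeholder.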
Pages: 1787-1796
Number of pages: 10