High-level Tracking of Autonomous Underwater Vehicles Based on Pseudo Averaged Q-learning

Cited by: 7
Authors
Shi, Wenjie [1]
Song, Shiji [1]
Wu, Cheng [1]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2018
Funding
National Science Foundation (USA);
Keywords
autonomous underwater vehicle; trajectory tracking; reinforcement learning; policy gradient; actor-critic;
DOI
10.1109/SMC.2018.00701
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we investigate the trajectory tracking problem of underactuated autonomous underwater vehicles (AUVs) with input saturation. Our proposed model-free algorithm realizes high-level tracking control and stable learning by employing a novel actors-critics architecture, in which a critic and multiple actors are learned to estimate the action-value function and the deterministic policy, respectively. For the critic, Pseudo Averaged Q-learning, a simple extension of Q-learning, is proposed to calculate the target value. Specifically, the action-value of the next state is obtained by maximizing, among all actors, the average over the most recent previously learned action-value estimates. For the actors, the deterministic policy gradient is applied to update the weights. The effectiveness and performance of the proposed Pseudo Averaged Q-learning based deterministic policy gradient (PAQ-DPG) algorithm are verified by applying it to an underactuated AUV. The results demonstrate the high-level tracking control accuracy and learning stability of the PAQ-DPG algorithm. Moreover, under the proposed actors-critics framework, increasing the number of actors further improves performance.
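The target-value rule summarized in the abstract can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' implementation: it assumes the target has the form r + gamma * max over actors i of the mean over stored critic snapshots k of Q_k(s', mu_i(s')). The function name `paq_target`, its parameters, the discount factor `gamma`, and the toy actors and critics are all hypothetical and chosen only for illustration.

```python
import numpy as np


def paq_target(reward, next_state, actors, critic_snapshots, gamma=0.99):
    """Hypothetical target: r + gamma * max_i mean_k Q_k(s', mu_i(s')).

    actors: list of callables mapping a state to an action (assumed form).
    critic_snapshots: list of callables Q_k(state, action) holding the most
        recent previously learned action-value estimates (assumed form).
    """
    # One candidate next action per actor.
    candidates = [actor(next_state) for actor in actors]
    # Average each candidate's value over the stored critic snapshots.
    averaged = [
        np.mean([q(next_state, action) for q in critic_snapshots])
        for action in candidates
    ]
    # Take the best averaged value among all actors.
    return reward + gamma * max(averaged)


# Toy usage with made-up actors and critics (illustration only).
toy_actors = [lambda s: 0.0, lambda s: 1.0]
toy_critics = [lambda s, a: a, lambda s, a: 3.0 * a]
toy_target = paq_target(1.0, next_state=0.0, actors=toy_actors,
                        critic_snapshots=toy_critics, gamma=0.5)
```

Averaging over several past estimates before taking the maximum is the step that, under this reading, distinguishes the rule from a plain Q-learning target computed from a single critic.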
Pages: 4138 - 4143
Page count: 6
Related papers
50 in total
  • [21] Error scaling-based adaptive region tracking control for autonomous underwater vehicles
    Liu, Hongwei
    Zhao, Wende
    Liu, Xing
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART I-JOURNAL OF SYSTEMS AND CONTROL ENGINEERING, 2023, 237 (10) : 1867 - 1883
  • [22] Predictive Trajectory Tracking Control of Autonomous Underwater Vehicles Based on Variable Fuzzy Predictor
    Jianchuan Yin
    Ning Wang
    International Journal of Fuzzy Systems, 2021, 23 : 1809 - 1822
  • [23] Path planning for autonomous mobile robot using transfer learning-based Q-learning
    Wu, Shengshuai
    Hu, Jinwen
    Zhao, Chunhui
    Pan, Quan
    PROCEEDINGS OF 2020 3RD INTERNATIONAL CONFERENCE ON UNMANNED SYSTEMS (ICUS), 2020 : 88 - 93
  • [24] Tabular Q-learning Based Reinforcement Learning Agent for Autonomous Vehicle Drift Initiation and Stabilization
    Toth, Szilard H.
    Bardos, Adam
    Viharos, Zsolt J.
    IFAC PAPERSONLINE, 2023, 56 (02) : 4896 - 4903
  • [25] A behavior-based scheme using reinforcement learning for autonomous underwater vehicles
    Carreras, M
    Yuh, J
    Batlle, J
    Ridao, P
    IEEE JOURNAL OF OCEANIC ENGINEERING, 2005, 30 (02) : 416 - 427
  • [26] Continuous interval type-2 fuzzy Q-learning algorithm for trajectory tracking tasks for vehicles
    Xuan, Chengbin
    Lam, Hak-Keung
    Shi, Qian
    Chen, Ming
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022, 32 (08) : 4788 - 4815
  • [27] Supervised reinforcement learning based trajectory tracking control for autonomous vehicles
    Mihaly, Andras
    Van Tan Vu
    Trong Tu Do
    Gaspar, Peter
    IFAC PAPERSONLINE, 2024, 58 (10) : 140 - 145
  • [28] Prioritized experience replay based reinforcement learning for adaptive tracking control of autonomous underwater vehicle
    Li, Ting
    Yang, Dongsheng
    Xie, Xiangpeng
    APPLIED MATHEMATICS AND COMPUTATION, 2023, 443
  • [29] Trajectory tracking control for autonomous underwater vehicles based on dual closed-loop of MPC with uncertain dynamics
    Gong, Peng
    Yan, Zheping
    Zhang, Wei
    Tang, Jialing
    OCEAN ENGINEERING, 2022, 265
  • [30] Improved Adaptive High-Order Sliding Mode-Based Control for Trajectory Tracking of Autonomous Underwater Vehicles
    Guerrero, Jesus
    Chemori, Ahmed
    Creuze, Vincent
    Torres, Jorge
    IEEE JOURNAL OF OCEANIC ENGINEERING, 2024, 49 (04) : 1337 - 1349