High-level Tracking of Autonomous Underwater Vehicles Based on Pseudo Averaged Q-learning

Cited by: 7
Authors
Shi, Wenjie [1 ]
Song, Shiji [1 ]
Wu, Cheng [1 ]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2018
Funding
U.S. National Science Foundation;
Keywords
autonomous underwater vehicle; trajectory tracking; reinforcement learning; policy gradient; actor-critic;
DOI
10.1109/SMC.2018.00701
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
In this paper, we investigate the trajectory tracking problem of underactuated autonomous underwater vehicles (AUVs) with input saturation. Our proposed model-free algorithm achieves high-level tracking control and stable learning by employing a novel actors-critics architecture, in which a critic and multiple actors are learned to estimate the action-value function and the deterministic policy, respectively. For the critic, Pseudo Averaged Q-learning, a simple extension of Q-learning, is proposed to calculate the target value: the action-value of the next state is obtained by maximizing, over all actors, the average of the most recent previously learned action-value estimates. For the actors, the deterministic policy gradient is applied to update the weights. The effectiveness and performance of the proposed Pseudo Averaged Q-learning based deterministic policy gradient (PAQ-DPG) algorithm are verified by applying it to an underactuated AUV. The results demonstrate the high-level tracking accuracy and learning stability of the PAQ-DPG algorithm. Moreover, under the proposed actors-critics framework, increasing the number of actors further improves performance.
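The target-value rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `paq_target`, the representation of past critic estimates as a list of callables, the discount factor `gamma`, and the terminal-state handling are all assumptions made for illustration. It only shows the stated idea, namely averaging several previously learned action-value estimates at the next state and taking the maximum of that average over the actions proposed by the actors.

```python
from typing import Callable, Sequence

State = Sequence[float]
Action = Sequence[float]
QFunction = Callable[[State, Action], float]  # one stored action-value estimate
Policy = Callable[[State], Action]            # one deterministic actor


def paq_target(
    reward: float,
    next_state: State,
    critics: Sequence[QFunction],
    actors: Sequence[Policy],
    gamma: float = 0.99,
    done: bool = False,
) -> float:
    """Hypothetical target: r + gamma * max over actors of the averaged Q-value."""
    if done:
        # Assumption: no bootstrapping from a terminal state.
        return reward
    best_average = max(
        # Average the stored estimates at the action this actor proposes.
        sum(q(next_state, actor(next_state)) for q in critics) / len(critics)
        for actor in actors
    )
    return reward + gamma * best_average


# Toy example: two stored critic estimates and two actors (all made up).
critics = [
    lambda s, a: -(a[0] - 1.0) ** 2,
    lambda s, a: -(a[0] - 3.0) ** 2,
]
actors = [lambda s: (0.0,), lambda s: (2.0,)]

target = paq_target(
    reward=1.0, next_state=(0.0,), critics=critics, actors=actors, gamma=0.5
)
print(target)  # 0.5: the second actor's action has the higher averaged value (-1.0)
```

In this toy case the first actor's action averages to -5.0 and the second's to -1.0, so the target is 1.0 + 0.5 * (-1.0) = 0.5. How many past estimates are kept and how the actors are trained are details of the paper that this sketch does not attempt to reproduce.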
Pages: 4138 - 4143
Number of pages: 6
Related papers
50 in total
  • [41] Neural network-based target tracking control of underactuated autonomous underwater vehicles with a prescribed performance
    Elhaki, Omid
    Shojaei, Khoshnam
    OCEAN ENGINEERING, 2018, 167 : 239 - 256
  • [42] Optimization of the Energy Consumption of Depth Tracking Control Based on Model Predictive Control for Autonomous Underwater Vehicles
    Yao, Feng
    Yang, Chao
    Zhang, Mingjun
    Wang, Yujia
    SENSORS, 2019, 19 (01)
  • [43] High-gain observer-based model predictive control for cross tracking of underactuated autonomous Underwater Vehicles: A comparative study
    Zhang, Guangjie
    Yan, Weisheng
    Gao, Jian
    Liu, Changxin
    INDIAN JOURNAL OF GEO-MARINE SCIENCES, 2017, 46 (12) : 2444 - 2451
  • [44] High-speed railway dynamic scheduling based on Q-learning method
    Han X.-C.
    Yu S.-P.
    Yuan Z.-M.
    Cheng L.-J.
    Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2021, 38 (10) : 1511 - 1521
  • [45] An Observer-Based Adaptive Neural Network Finite-Time Tracking Control for Autonomous Underwater Vehicles via Command Filters
    Guo, Jun
    Wang, Jun
    Bo, Yuming
    DRONES, 2023, 7 (10)
  • [46] Collaborative Q-learning Path Planning for Autonomous Robots based on Holonic Multi-Agent System
    Lamini, Chaymaa
    Fathi, Youssef
    Benhlima, Said
    2015 10TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS: THEORIES AND APPLICATIONS (SITA), 2015
  • [47] Navigation method for autonomous mobile robots based on ROS and multi-robot improved Q-learning
    Hamed, Oussama
    Hamlich, Mohamed
    PROGRESS IN ARTIFICIAL INTELLIGENCE, 2024
  • [48] A Shift Schedule to Optimize Pure Electric Vehicles Based on RL Using Q-Learning and Opt LHD
    Yu, Xin
    Zhao, Ling
    Zhang, Kun
    Guo, Hongqiang
    PROCESSES, 2022, 10 (10)
  • [49] Observation-Based Nonlinear Proportional-Derivative Control for Robust Trajectory Tracking for Autonomous Underwater Vehicles
    Guerrero, Jesus
    Torres, Jorge
    Creuze, Vincent
    Chemori, Ahmed
    IEEE JOURNAL OF OCEANIC ENGINEERING, 2020, 45 (04) : 1190 - 1202
  • [50] Fixed-time extended state observer-based trajectory tracking control for autonomous underwater vehicles
    Zheng, Jiaqi
    Song, Lei
    Liu, Lingya
    Yu, Wenbin
    Zhu, Shanying
    Wang, Yiyin
    Chen, Cailian
    ASIAN JOURNAL OF CONTROL, 2022, 24 (02) : 686 - 701