High-level Tracking of Autonomous Underwater Vehicles Based on Pseudo Averaged Q-learning

Cited by: 7
Authors
Shi, Wenjie [1]
Song, Shiji [1]
Wu, Cheng [1]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
Source
2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2018
Funding
National Science Foundation (USA);
Keywords
autonomous underwater vehicle; trajectory tracking; reinforcement learning; policy gradient; actor-critic;
DOI
10.1109/SMC.2018.00701
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we investigate the trajectory tracking problem of underactuated autonomous underwater vehicles (AUVs) with input saturation. The proposed model-free algorithm achieves high-level tracking control and stable learning by employing a novel actors-critics architecture, in which a critic and multiple actors are learned to estimate the action-value function and the deterministic policy, respectively. For the critic, Pseudo Averaged Q-learning, a simple extension of Q-learning, is proposed to calculate the target value: the action-value of the next state is obtained by maximizing, among all actors, the average over the last several learned action-value estimates. For the actors, the deterministic policy gradient is applied to update the weights. The effectiveness and performance of the proposed Pseudo Averaged Q-learning based deterministic policy gradient (PAQ-DPG) algorithm are verified by applying it to an underactuated AUV. The results demonstrate the high-level tracking accuracy and learning stability of the PAQ-DPG algorithm. Moreover, under the proposed actors-critics framework, increasing the number of actors further improves performance.
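The target-value rule described in the abstract can be sketched roughly as follows. This is only one reading of the abstract's wording, not the authors' implementation: the function name `paq_target`, the discount factor `gamma`, the `done` flag, and the representation of critics and actors as plain Python callables are all assumptions made for illustration.

```python
import numpy as np

def paq_target(reward, next_state, critics, actors, gamma=0.99, done=False):
    """Hypothetical sketch of a Pseudo Averaged Q-learning target.

    critics: list of K previously learned action-value estimates, each q(s, a)
    actors:  list of deterministic policies, each mu(s) -> action
    """
    # Each actor proposes one candidate action for the next state.
    candidates = [actor(next_state) for actor in actors]
    # Average the K stored critic estimates for every candidate action.
    averaged_q = [
        float(np.mean([q(next_state, action) for q in critics]))
        for action in candidates
    ]
    # Maximize the averaged estimate over the actors' candidates.
    best_value = max(averaged_q)
    # Standard one-step bootstrap; no bootstrap on terminal transitions.
    return reward if done else reward + gamma * best_value

# Toy setup (assumed, scalar state and action): two critic snapshots, three actors.
critics = [
    lambda s, a: -(a - 1.0) ** 2,
    lambda s, a: -(a - 1.0) ** 2 + 2.0,
]
actors = [lambda s: 0.0, lambda s: 1.0, lambda s: 3.0]

target = paq_target(reward=0.5, next_state=0.0, critics=critics,
                    actors=actors, gamma=0.9)
terminal_target = paq_target(reward=0.5, next_state=0.0, critics=critics,
                             actors=actors, gamma=0.9, done=True)
```

In this sketch, averaging over several critic snapshots is what would damp the variance of the bootstrap target, while the maximum over actors replaces the intractable maximization over a continuous action space. Whether the paper weights the snapshots equally or handles input saturation inside the actors is not stated in the abstract.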
Pages: 4138-4143
Number of pages: 6
Related papers
50 in total
  • [31] Autonomous Overtaking Decision Making of Driverless Bus Based on Deep Q-learning Method
    Yu, Lingli
    Shao, Xuanya
    Yan, Xiaoxin
    2017 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (IEEE ROBIO 2017), 2017, : 2267 - 2272
  • [32] Dynamic Target Tracking of Autonomous Underwater Vehicle Based on Deep Reinforcement Learning
    Shi, Jiaxiang
    Fang, Jianer
    Zhang, Qizhong
    Wu, Qiuxuan
    Zhang, Botao
    Gao, Farong
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2022, 10 (10)
  • [33] Reinforcement Learning-Based Formation Control of Autonomous Underwater Vehicles with Model Interferences
    Cao, Wenqiang
    Yan, Jing
    Yang, Xian
    Luo, Xiaoyuan
    2021 PROCEEDINGS OF THE 40TH CHINESE CONTROL CONFERENCE (CCC), 2021, : 4020 - 4025
  • [34] Adaptive reward shaping based reinforcement learning for docking control of autonomous underwater vehicles
    Chu, Shuguang
    Lin, Mingwei
    Li, Dejun
    Lin, Ri
    Xiao, Sa
    OCEAN ENGINEERING, 2025, 318
  • [35] Distributed cooperative H∞ optimal control of underactuated autonomous underwater vehicles based on reinforcement learning and prescribed performance
    Zhuo, Jiaoyang
    Tian, Xuehong
    Liu, Haitao
    OCEAN ENGINEERING, 2024, 312
  • [36] Motion Planning of Autonomous Vehicles Using Obstacle-Free Q-Learning Path Generator and Model Predictive Control
    Bazargani, Roozbeh
    Nadrabadi, Mahboobe Shakeri
    Firouzmand, Elnaz
    2021 9TH RSI INTERNATIONAL CONFERENCE ON ROBOTICS AND MECHATRONICS (ICROM), 2021, : 230 - 235
  • [38] Q-Learning based system for Path Planning with Unmanned Aerial Vehicles swarms in obstacle environments
    Puente-Castro, Alejandro
    Rivero, Daniel
    Pedrosa, Eurico
    Pereira, Artur
    Lau, Nuno
    Fernandez-Blanco, Enrique
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 235
  • [39] Joint localisation and tracking for autonomous underwater vehicle: a reinforcement learning-based approach
    Yan, Jing
    Li, Xin
    Luo, Xiaoyuan
    Gong, Yadi
    Guan, Xinping
    IET CONTROL THEORY AND APPLICATIONS, 2019, 13 (17) : 2856 - 2865
  • [40] Command-Filter-Based Region-Tracking Control for Autonomous Underwater Vehicles with Measurement Noise
    Lv, Tu
    Wang, Yujia
    Liu, Xing
    Zhang, Mingjun
    JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2023, 11 (11)