High-level Tracking of Autonomous Underwater Vehicles Based on Pseudo Averaged Q-learning

Cited by: 7

Authors:

Shi, Wenjie [1]

Song, Shiji [1]

Wu, Cheng [1]

Affiliation:

[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2018

Funding:

U.S. National Science Foundation;

Keywords:

autonomous underwater vehicle; trajectory tracking; reinforcement learning; policy gradient; actor-critic;

DOI:

10.1109/SMC.2018.00701

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

In this paper, we investigate the trajectory tracking problem of underactuated autonomous underwater vehicles (AUVs) with input saturation. Our proposed model-free algorithm achieves high-level tracking control and stable learning by employing a novel actors-critics architecture, in which a critic and multiple actors are learned to estimate the action-value function and the deterministic policy, respectively. For the critic, Pseudo Averaged Q-learning, a simple extension of Q-learning, is proposed to calculate the target value. Specifically, the action-value of the next state is obtained by maximizing, over all actors, the average of the most recent previously learned action-value estimates. For the actors, the deterministic policy gradient is applied to update the weights. The effectiveness and performance of the proposed Pseudo Averaged Q-learning based deterministic policy gradient (PAQ-DPG) algorithm are verified by applying it to an underactuated AUV. The results demonstrate the high-level tracking control accuracy and learning stability of the PAQ-DPG algorithm. Moreover, under the proposed actors-critics framework, increasing the number of actors further improves performance.
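
The target-value rule described in the abstract can be illustrated with a small sketch. This is only one possible reading of the abstract, not the authors' implementation: the function name `paq_target`, the representation of critics and actors as plain Python callables, and the discount factor `gamma` are all assumptions introduced for illustration. The sketch averages several previously learned action-value estimates at the next state, takes the maximum of that average over the actors' proposed actions, and forms the usual one-step bootstrapped target.

```python
from typing import Callable, Sequence

State = Sequence[float]
Action = Sequence[float]
Critic = Callable[[State, Action], float]  # hypothetical: Q(s, a) -> value
Actor = Callable[[State], Action]          # hypothetical: mu(s) -> action


def paq_target(
    reward: float,
    next_state: State,
    critics: Sequence[Critic],
    actors: Sequence[Actor],
    gamma: float = 0.99,
) -> float:
    """One-step target: r + gamma * max over actors of the mean critic estimate.

    `critics` stands for the last few learned action-value estimates
    (assumed here to be stored snapshots); `actors` are the deterministic
    policies that each propose an action for the next state.
    """
    if not critics or not actors:
        raise ValueError("need at least one critic snapshot and one actor")
    best = float("-inf")
    for actor in actors:
        action = actor(next_state)
        # Average the previously learned estimates for this actor's action.
        avg_q = sum(q(next_state, action) for q in critics) / len(critics)
        best = max(best, avg_q)
    return reward + gamma * best


# Toy usage with two made-up critic snapshots and two made-up actors.
critics = [lambda s, a: -(a[0] - 1.0) ** 2, lambda s, a: -(a[0] - 3.0) ** 2]
actors = [lambda s: [0.0], lambda s: [2.0]]
target = paq_target(1.0, [0.0], critics, actors, gamma=0.5)
print(target)  # 0.5
```

In the toy usage, the second actor's action has the higher averaged estimate (-1 versus -5), so the target is 1.0 + 0.5 * (-1.0) = 0.5. Averaging over several estimates before maximizing is the part intended to reflect the stabilizing role the abstract attributes to Pseudo Averaged Q-learning; how many estimates are kept and how they are stored are left open here.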

Pages: 4138-4143

Number of pages: 6

50 items in total

[1] Multi Pseudo Q-Learning-Based Deterministic Policy Gradient for Tracking Control of Autonomous Underwater Vehicles
Shi, Wenjie
Song, Shiji
Wu, Cheng
Chen, C. L. Philip
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (12) : 3534 - 3546
[2] Distributed Path Tracking for Autonomous Underwater Vehicles Based on Pseudo Position Feedback
Gao, Huanli
Li, Wei
Cai, He
Gu, Zekai
JOURNAL OF MARINE SCIENCE AND ENGINEERING, 2022, 10 (10)
[3] Heading Control for an Autonomous Underwater Vehicle Using ELM-based Q-Learning
Wang, Dianrui
Shen, Yue
Sha, Qixin
Li, Guangliang
Jiang, Jingtao
Yan, Tianhong
Wan, Junhe
He, Bo
2017 IEEE UNDERWATER TECHNOLOGY (UT), 2017
[4] Cross-domain Monitoring of Underwater Targets Based on Q-learning for Heterogeneous Unmanned Vehicles
Lin, Jingsheng
Yan, Jing
Yang, Xian
Luo, Xiaoyuan
2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 4299 - 4304
[5] Design of an Active Vision System for High-Level Isolation Units through Q-Learning
Ruiz, Andrea Gil
Victores, Juan G.
Lukawski, Bartek
Balaguer, Carlos
APPLIED SCIENCES-BASEL, 2020, 10 (17)
[6] Autonomous Lane Keeping Based on Approximate Q-learning
Lee, Jonggu
Kim, Taewan
Kim, H. Jin
2017 14TH INTERNATIONAL CONFERENCE ON UBIQUITOUS ROBOTS AND AMBIENT INTELLIGENCE (URAI), 2017, : 402 - 405
[7] MPC-Based Motion Planning and Tracking Control for Autonomous Underwater Vehicles
Huang, Zhihao
Sun, Bing
Zhang, Wei
2023 35TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2023, : 2077 - 2082
[8] Trajectory tracking control of vectored thruster autonomous underwater vehicles based on deep reinforcement learning
Liu, Tao
Zhao, Jintao
Hu, Yuli
Huang, Junhao
SHIPS AND OFFSHORE STRUCTURES, 2024
[9] Robust MPC-based trajectory tracking of autonomous underwater vehicles with model uncertainty
Yan, Zheping
Yan, Jinyu
Cai, Sijia
Yu, Yuyang
Wu, Yifan
OCEAN ENGINEERING, 2023, 286
[10] Trajectory tracking control based on a virtual closed-loop system for autonomous underwater vehicles
Liu, Xing
Zhang, Mingjun
Chen, Zeyu
INTERNATIONAL JOURNAL OF CONTROL, 2020, 93 (12) : 2789 - 2803