High-level Tracking of Autonomous Underwater Vehicles Based on Pseudo Averaged Q-learning

Cited by: 7

Authors:

Shi, Wenjie [1]

Song, Shiji [1]

Wu, Cheng [1]

Affiliations:

[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2018

Funding:

U.S. National Science Foundation;

Keywords:

autonomous underwater vehicle; trajectory tracking; reinforcement learning; policy gradient; actor-critic;

DOI:

10.1109/SMC.2018.00701

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

In this paper, we investigate the trajectory tracking problem of underactuated autonomous underwater vehicles (AUVs) with input saturation. Our proposed model-free algorithm achieves high-level tracking control and stable learning by employing a novel actors-critics architecture, in which a critic and multiple actors are learned to estimate the action-value function and the deterministic policy, respectively. For the critic, Pseudo Averaged Q-learning, a simple extension of Q-learning, is proposed to calculate the target value: the action-value of the next state is obtained by maximizing, among all actors, the average over multiple previously learned action-value estimates. For the actors, the deterministic policy gradient is applied to update the weights. The effectiveness and performance of the proposed Pseudo Averaged Q-learning based deterministic policy gradient (PAQ-DPG) algorithm are verified by applying it to an underactuated AUV. The results demonstrate the high-level tracking control accuracy and learning stability of the PAQ-DPG algorithm. Moreover, under the proposed actors-critics framework, increasing the number of actors further improves performance.
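
The target computation described in the abstract can be illustrated with a small sketch. This is only one reading of the abstract, not the paper's implementation: it assumes the target takes the form y = r + γ · max over actors k of the mean over stored critic estimates Q_j(s', μ_k(s')). The names `paq_target`, `actors`, `critic_snapshots`, and `gamma`, along with the toy scalar actors and critics, are hypothetical and chosen purely for illustration; the abstract does not state the number of stored estimates or the network details.

```python
import numpy as np

def paq_target(reward, next_state, actors, critic_snapshots, gamma=0.99, done=False):
    """Assumed target: r + gamma * max_k mean_j Q_j(s', mu_k(s')).

    actors           -- callables mapping a state to a deterministic action
    critic_snapshots -- previously learned action-value estimates Q_j(s, a)
    """
    if done:
        # No bootstrapping from a terminal state.
        return float(reward)
    best = -np.inf
    for actor in actors:
        action = actor(next_state)
        # Average the stored critic estimates for this actor's proposed action.
        avg_q = np.mean([q(next_state, action) for q in critic_snapshots])
        best = max(best, avg_q)
    return float(reward + gamma * best)

# Toy scalar example (hypothetical actors and critics, not from the paper).
actors = [lambda s: -0.5 * s, lambda s: -1.0 * s]
critic_snapshots = [
    lambda s, a: -(s + a) ** 2,
    lambda s, a: -(s + a) ** 2 - 1.0,
]

target = paq_target(1.0, 2.0, actors, critic_snapshots, gamma=0.9)
terminal_target = paq_target(1.0, 2.0, actors, critic_snapshots, gamma=0.9, done=True)
```

In this toy case the second actor's action gives the higher averaged estimate (-0.5 versus -1.5), so the target bootstraps from it. Averaging over several stored estimates before taking the maximum is what the abstract credits for more stable learning.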


Pages: 4138-4143

Page count: 6
