High-level Tracking of Autonomous Underwater Vehicles Based on Pseudo Averaged Q-learning

Cited by: 7

Authors:

Shi, Wenjie [1]

Song, Shiji [1]

Wu, Cheng [1]

Affiliations:

[1] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China

Source:

2018 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC) | 2018

Funding:

U.S. National Science Foundation;

Keywords:

autonomous underwater vehicle; trajectory tracking; reinforcement learning; policy gradient; actor-critic;

DOI:

10.1109/SMC.2018.00701

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Subject classification code:

0812;

Abstract:

In this paper, we investigate the trajectory tracking problem of underactuated autonomous underwater vehicles (AUVs) with input saturation. Our proposed model-free algorithm achieves high-level tracking control and stable learning by employing a novel actors-critics architecture, in which a critic and multiple actors are learned to estimate the action-value function and the deterministic policy, respectively. For the critic, Pseudo Averaged Q-learning, a simple extension of Q-learning, is proposed to calculate the target value: the action-value of the next state is obtained by maximizing, over all actors, the average of the most recent previously learned action-value estimates. For the actors, the deterministic policy gradient is applied to update the weights. The effectiveness and performance of the proposed Pseudo Averaged Q-learning based deterministic policy gradient (PAQ-DPG) algorithm are verified by applying it to an underactuated AUV. The results demonstrate the high-level tracking control accuracy and learning stability of the PAQ-DPG algorithm. Moreover, under the proposed actors-critics framework, increasing the number of actors further improves performance.
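The target-value rule described in the abstract can be sketched roughly as follows. This is only one reading of that single sentence, not the authors' implementation: the function name `paq_target`, the scalar state and action types, the discount factor `gamma`, and the use of plain callables for actors and critic snapshots are all assumptions made for illustration. The sketch averages the last several critic estimates for the action each actor proposes in the next state, then takes the maximum of those averages across actors.

```python
from typing import Callable, Sequence

# Hypothetical signatures, chosen for illustration only.
Actor = Callable[[float], float]          # next state -> proposed action
Critic = Callable[[float, float], float]  # (state, action) -> action-value estimate


def paq_target(
    reward: float,
    next_state: float,
    actors: Sequence[Actor],
    past_critics: Sequence[Critic],
    gamma: float = 0.99,  # assumed discount factor
) -> float:
    """Return reward + gamma * max over actors of the averaged next-state value."""
    if not actors or not past_critics:
        raise ValueError("need at least one actor and one critic snapshot")

    def averaged_value(actor: Actor) -> float:
        # Average the stored critic snapshots at the action this actor proposes.
        action = actor(next_state)
        total = sum(q(next_state, action) for q in past_critics)
        return total / len(past_critics)

    # Maximize the averaged estimate among all actors.
    best = max(averaged_value(actor) for actor in actors)
    return reward + gamma * best


if __name__ == "__main__":
    # Toy actors and critic snapshots, purely illustrative.
    toy_actors = [lambda s: 0.0, lambda s: 1.0]
    toy_critics = [lambda s, a: s + a, lambda s, a: s + 3.0 * a]
    print(paq_target(0.5, 1.0, toy_actors, toy_critics, gamma=0.9))
```

Averaging over several critic snapshots is meant to smooth out noise in any single estimate, while the maximum over actors plays the role that the maximum over actions plays in ordinary Q-learning.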

Pages: 4138-4143

Page count: 6
