Reinforcement learning-based finite-time tracking control of an unknown unmanned surface vehicle with input constraints

Times Cited: 69
Authors
Wang, Ning [1 ,3 ]
Gao, Ying [2 ]
Yang, Chen [2 ]
Zhang, Xuefeng [2 ]
Affiliations
[1] Dalian Maritime Univ, Sch Marine Engn, Dalian 116026, Peoples R China
[2] Dalian Maritime Univ, Sch Marine Elect Engn, Dalian 116026, Peoples R China
[3] Harbin Engn Univ, Sch Shipbldg Engn, Harbin 150001, Peoples R China
Keywords
Reinforcement learning-based finite-time control; Optimal tracking control; Unknown system dynamics; Input constraints; Unmanned surface vehicle; H-INFINITY CONTROL; NONLINEAR-SYSTEMS; CONTROL DESIGN
DOI
10.1016/j.neucom.2021.04.133
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
In this paper, subject to completely unknown system dynamics and input constraints, a reinforcement learning-based finite-time trajectory tracking control (RLFTC) scheme is created for an unmanned surface vehicle (USV) by combining an actor-critic reinforcement learning (RL) mechanism with a finite-time control technique. Unlike previous RL-based tracking schemes, which require infinite-time convergence and are therefore rather sensitive to complex unknowns, an actor-critic finite-time control structure is created by employing adaptive neural network identifiers to recursively update the actor and critic, such that learning-based robustness can be sufficiently enhanced. Moreover, deduced from the Bellman error formulation, the proposed RLFTC is directly optimized in a finite-time manner. Theoretical analysis shows that the proposed RLFTC scheme ensures semi-global practical finite-time stability (SGPFS) for the closed-loop USV system, with tracking errors converging to an arbitrarily small neighborhood of the origin in finite time, subject to an optimal cost. Both mathematical simulation and virtual-reality experiments demonstrate the remarkable effectiveness and superiority of the proposed RLFTC scheme. (c) 2021 Elsevier B.V. All rights reserved.
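The abstract describes actor and critic updates driven by the Bellman error. The following is a minimal illustrative sketch of that general mechanism, not the paper's algorithm: a one-parameter critic and a one-parameter actor on an assumed toy scalar plant x_{k+1} = x_k + u_k tracking a constant reference. The plant, cost weights, learning rates, and the clipping of the actor gain to a stabilizing set are all assumptions chosen for the demo.

```python
import numpy as np

# Toy actor-critic sketch driven by the Bellman (temporal-difference) error.
# Everything here (plant, gains, cost weights) is illustrative, not from the paper.
rng = np.random.default_rng(0)
gamma, alpha_c, alpha_a, sigma = 0.9, 0.05, 0.002, 0.1

w = 0.0        # critic weight: V(e) ~= w * e**2 approximates the cost-to-go
k = 0.1        # actor gain: mean control u = -k * e
x, x_ref = 0.0, 1.0
errors = []

for step in range(2000):
    e = x - x_ref
    u = -k * e + sigma * rng.standard_normal()   # Gaussian exploration noise
    x_next = x + u                               # assumed integrator plant
    e_next = x_next - x_ref
    cost = e**2 + 0.1 * u**2
    # Bellman error: residual of the critic in the Bellman equation
    delta = cost + gamma * w * e_next**2 - w * e**2
    w += alpha_c * delta * e**2                  # critic: TD(0) update
    # actor: policy-gradient step lowering the likelihood of costly actions
    grad_log_pi = (u + k * e) / sigma**2 * (-e)
    k -= alpha_a * delta * grad_log_pi
    k = float(np.clip(k, 0.05, 1.9))             # project gain onto a stabilizing set
    x = x_next
    errors.append(abs(e))
```

In this sketch the same scalar `delta` drives both updates, which is the shared structure of actor-critic methods; the paper's contribution, per the abstract, is recasting such updates with neural-network identifiers so that convergence is finite-time rather than asymptotic.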
Pages: 26-37 (12 pages)