Distributed cooperative H∞ optimal control of underactuated autonomous underwater vehicles based on reinforcement learning and prescribed performance

Cited: 2
Authors
Zhuo, Jiaoyang [1 ,2 ]
Tian, Xuehong [1 ,2 ,3 ]
Liu, Haitao [1 ,2 ,3 ]
Affiliations
[1] Guangdong Ocean Univ, Sch Mech Engn, Zhanjiang 524088, Peoples R China
[2] Guangdong Ocean Univ, Shenzhen Inst, Shenzhen 518120, Peoples R China
[3] Guangdong Engn Technol Res Ctr Ocean Equipment & M, Zhanjiang 524088, Peoples R China
Keywords
Underactuated autonomous underwater vehicle; Optimal control; Trajectory tracking; Prescribed performance control; Reinforcement learning; H-infinity control; Tracking control
DOI
10.1016/j.oceaneng.2024.119323
CLC classification
U6 [Waterway transport]; P75 [Ocean engineering];
Subject classification codes
0814 ; 081505 ; 0824 ; 082401 ;
Abstract
To balance energy resources and control performance, a distributed cooperative H-infinity optimal control method based on prescribed performance control (PPC) and a reinforcement learning (RL) algorithm with an actor-critic mechanism is proposed for multiple five-degree-of-freedom underactuated autonomous underwater vehicles (AUVs) subject to unknown uncertainties and disturbances. First, an optimal control strategy combined with PPC is proposed to achieve optimal control of the cooperative system while ensuring that the tracking error always remains within the prescribed boundary. Second, to suppress uncertain disturbances, an H-infinity control method is introduced to improve the robustness of the system. Achieving H-infinity optimal control requires solving the Hamilton-Jacobi-Bellman (HJB) equation, whose inherent nonlinearity makes it difficult to solve analytically. Therefore, an adaptive approximation strategy incorporating an online RL method with an actor-critic architecture is used to address this problem; it dynamically adjusts the control strategy through an environment assessment-and-feedback mechanism to ensure system control performance. In addition, a distributed adaptive state observer is proposed so that each agent can accurately obtain the virtual leader's information even when it communicates only with neighboring agents. Under the proposed control method, all errors of the formation system are proven to be uniformly ultimately bounded according to Lyapunov stability theory. Finally, a numerical simulation is performed to further demonstrate the effectiveness and feasibility of the proposed method.
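The actor-critic approximation of the H-infinity HJB equation described in the abstract can be illustrated with a minimal sketch. This is not the paper's AUV formation controller: it uses a hypothetical scalar plant x' = a*x + b*u + d*w with cost r = q*x^2 + u^2 - gamma^2*w^2, a polynomial critic basis, and a normalized semi-gradient update on the Bellman residual; the actor and the worst-case disturbance policy are both read off the critic's gradient, as is standard in RL-based H-infinity designs.

```python
import numpy as np

np.random.seed(0)

# Hypothetical scalar plant and cost weights (illustration only).
a, b, d = -1.0, 1.0, 0.5       # drift, input gain, disturbance gain
q, gamma = 1.0, 2.0            # state weight, H-infinity attenuation level

phi  = lambda x: np.array([x**2, x**4])    # critic basis: V(x) = W @ phi(x)
dphi = lambda x: np.array([2*x, 4*x**3])   # basis gradient dphi/dx

W = np.zeros(2)                # critic weights, learned online
x, dt, lr = 1.0, 0.01, 0.5     # state, integration step, critic learning rate

for _ in range(5000):
    grad_V = W @ dphi(x)                      # dV/dx under current critic
    u = -0.5 * b * grad_V                     # actor: u* = -1/2 R^-1 g^T dV/dx
    w = (d / (2 * gamma**2)) * grad_V         # worst-case disturbance policy
    xdot = a * x + b * u + d * w
    # HJB (Bellman) residual for the zero-sum game:
    # delta = q x^2 + u^2 - gamma^2 w^2 + dV/dx * xdot
    delta = q * x**2 + u**2 - gamma**2 * w**2 + grad_V * xdot
    # Normalized semi-gradient descent on delta^2 w.r.t. the critic weights.
    g = dphi(x) * xdot
    W -= lr * delta * g / (1.0 + g @ g)
    x += dt * xdot
    if abs(x) < 1e-3:                         # re-excite to keep learning
        x = np.random.uniform(-1.0, 1.0)

print(np.round(W, 3))
```

For this linear-quadratic special case the critic should approach V(x) ≈ p*x^2 with p solving the scalar game Riccati equation, so the dominant weight is the one on x^2; the paper's method applies the same idea with neural-network bases to the nonlinear multi-AUV dynamics.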
Pages: 16
Related papers
49 in total
[11] Elhaki, Omid; Shojaei, Khoshnam. A robust neural network approximation-based prescribed performance output-feedback controller for autonomous underwater vehicles with actuators saturation. Engineering Applications of Artificial Intelligence, 2020, 88.
[12] Fang, Kai; Fang, Haolin; Zhang, Jiawen; Yao, Jiaqi; Li, Jiawang. Neural adaptive output feedback tracking control of underactuated AUVs. Ocean Engineering, 2021, 234.
[13] Fang, Yuan; Huang, Zhenwei; Pu, Jinyun; Zhang, Jinsong. AUV position tracking and trajectory control based on fast-deployed deep reinforcement learning method. Ocean Engineering, 2022, 245.
[14] Guo, Zongyi; Henry, David; Guo, Jianguo; Wang, Zheng; Cieslak, Jerome; Chang, Jing. Control for systems with prescribed performance guarantees: An alternative interval theory-based approach. Automatica, 2022, 146.
[15] Hu, Wenbo; Chen, Fei; Xiang, Linying; Chen, Guanrong. Multi-ASV coordinated tracking with unknown dynamics and input underactuation via model-reference reinforcement learning control. IEEE Transactions on Cybernetics, 2023, 53(10): 6588-6597.
[16] Lan, Jie; Liu, Yan-Jun; Yu, Dengxiu; Wen, Guoxing; Tong, Shaocheng; Liu, Lei. Time-varying optimal formation control for second-order multiagent systems based on neural network observer and reinforcement learning. IEEE Transactions on Neural Networks and Learning Systems, 2024, 35(03): 3144-3155.
[17] Li, Yongming; Zhang, Jiaxin; Liu, Wei; Tong, Shaocheng. Observer-based adaptive optimized control for stochastic nonlinear systems with input and state constraints. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(12): 7791-7805.
[18] Li, Yongming; Liu, Yanjun; Tong, Shaocheng. Observer-based neuro-adaptive optimized control of strict-feedback nonlinear systems with state constraints. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(07): 3131-3145.
[19] Li, Zhifu; Wang, Ming; Ma, Ge; Zou, Tao. Adaptive reinforcement learning fault-tolerant control for AUVs with thruster faults based on the integral extended state observer. Ocean Engineering, 2023, 271.
[20] Li, Zhifu; Wang, Ming; Ma, Ge. Adaptive optimal trajectory tracking control of AUVs based on reinforcement learning. ISA Transactions, 2023, 137: 122-132.