Reinforcement learning for robust stabilization of nonlinear systems with asymmetric saturating actuators

Cited by: 17
Authors
Yang, Xiong [1]
Zhou, Yingjiang [2, 3]
Gao, Zhongke [1]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Coll Automation, Nanjing 210023, Peoples R China
[3] Nanjing Univ Posts & Telecommun, Coll Artificial Intelligence, Nanjing 210023, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Adaptive dynamic programming; Neural network control; Robust stabilization; Reinforcement learning; Saturating actuator; H-INFINITY CONTROL; EVENT-TRIGGERED CONTROL; CONTINUOUS-TIME SYSTEMS; CONSTRAINED-INPUT; TRACKING CONTROL; ITERATION; DESIGN;
DOI
10.1016/j.neunet.2022.11.012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We study the robust stabilization problem of a class of nonlinear systems with asymmetric saturating actuators and mismatched disturbances. Initially, we convert such a robust stabilization problem into a nonlinear-constrained optimal control problem by constructing a discounted cost function for the auxiliary system. Then, for the purpose of solving the nonlinear-constrained optimal control problem, we develop a simultaneous policy iteration (PI) in the reinforcement learning framework. The implementation of the simultaneous PI relies on an actor-critic architecture, which employs actor and critic neural networks (NNs) to separately approximate the control policy and the value function. To determine the actor and critic NNs' weights, we use the approach of weighted residuals together with the typical Monte-Carlo integration technique. Finally, we perform simulations of two nonlinear plants to validate the established theoretical claims. © 2022 Elsevier Ltd. All rights reserved.
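The abstract mentions determining critic weights by weighted residuals combined with Monte-Carlo integration. The sketch below illustrates that general idea only; it is not the paper's algorithm. It evaluates a fixed linear policy on a hypothetical unconstrained scalar plant, with no saturation, no disturbance, and no actor update. The plant constants, the gain `k`, the discount rate `gamma`, the polynomial basis, and the sampling domain are all assumptions chosen so that the result can be checked against a closed-form value.

```python
import numpy as np

# Hypothetical scalar plant x' = a*x + b*u under a fixed policy u = -k*x.
# All constants are illustrative assumptions, not values from the paper.
a, b, k = 1.0, 1.0, 3.0
q, r_u = 1.0, 1.0      # running cost q*x^2 + r_u*u^2
gamma = 0.1            # discount rate of the cost integral


def phi(x):
    """Critic basis functions [x^2, x^4], returned with shape (N, 2)."""
    return np.stack([x**2, x**4], axis=1)


def dphi(x):
    """Derivatives of the basis functions with respect to x, shape (N, 2)."""
    return np.stack([2.0 * x, 4.0 * x**3], axis=1)


def critic_weights(n_samples=2000, seed=0):
    """Galerkin weighted residuals with Monte-Carlo averages over [-1, 1]."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1.0, 1.0, size=n_samples)   # Monte-Carlo sample points
    u = -k * x
    xdot = a * x + b * u
    cost = q * x**2 + r_u * u**2
    # With V(x) = phi(x) @ w, the discounted Bellman residual is
    # e(x) = dV/dx * xdot - gamma * V + cost = sigma(x) @ w + cost.
    sigma = dphi(x) * xdot[:, None] - gamma * phi(x)
    # Weighted residuals: require the sample mean of phi_i(x) * e(x) to vanish.
    lhs = phi(x).T @ sigma / n_samples
    rhs = -phi(x).T @ cost / n_samples
    return np.linalg.solve(lhs, rhs)


w = critic_weights()
# Closed-form value V(x) = p*x^2 for this linear-quadratic special case.
p_exact = (q + r_u * k**2) / (gamma - 2.0 * (a - b * k))
```

Because the true value function lies in the span of the chosen basis, the residual can be driven to zero exactly, so `w[0]` matches `p_exact` and `w[1]` is numerically zero. For a nonlinear plant or a saturated policy the basis would only approximate the value function, and the sample average would introduce integration error.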
Pages: 132-141
Page count: 10
Related papers
44 in total
[1]   Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach [J].
Abu-Khalaf, M ;
Lewis, FL .
AUTOMATICA, 2005, 41 (05) :779-791
[2]  
Abu-Khalaf M., 2006, Nonlinear H2/H∞ Constrained Feedback Control: A Practical Design Approach Using Neural Networks
[3]  
Basar T., 1995, H∞ Optimal Control and Related Minimax Design Problems
[4]  
Ben-Israel A., 2003, Generalized Inverses: Theory and Applications, 2nd ed.
[5]  
FINLAYSON BA, 1972, METHOD WEIGHTED RESI
[6]   Data-Driven Cooperative Output Regulation of Multi-Agent Systems via Robust Adaptive Dynamic Programming [J].
Gao, Weinan ;
Jiang, Yu ;
Davari, Masoud .
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2019, 66 (03) :447-451
[7]   Learning-Based Adaptive Optimal Tracking Control of Strict-Feedback Nonlinear Systems [J].
Gao, Weinan ;
Jiang, Zhong-Ping .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (06) :2614-2624
[8]   H∞ control with constrained input for completely unknown nonlinear systems using data-driven reinforcement learning method [J].
Jiang, He ;
Zhang, Huaguang ;
Luo, Yanhong ;
Cui, Xiaohong .
NEUROCOMPUTING, 2017, 237 :226-234
[9]  
Jiang Y., 2017, ROBUST ADAPTIVE DYNA
[10]  
Khalil HK., 2002, Nonlinear Systems, 3rd ed.