Self-learning robust optimal control for continuous-time nonlinear systems with mismatched disturbances

Cited by: 72
Authors
Yang, Xiong [1]
He, Haibo [2]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA
Funding
National Natural Science Foundation of China; US National Science Foundation
Keywords
Adaptive dynamic programming; Neural network; Reinforcement learning; Robust optimal control; Mismatched disturbance; Tracking control; Feedback control; Design; Stabilization; Algorithm; Equation
DOI
10.1016/j.neunet.2017.11.022
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel adaptive dynamic programming (ADP)-based self-learning robust optimal control scheme for input-affine continuous-time nonlinear systems with mismatched disturbances. First, a stabilizing feedback controller for the original nonlinear system is designed by modifying the optimal control law of an auxiliary system. It is also demonstrated that this feedback controller optimizes a specified value function. Then, within the ADP framework, a single critic network is constructed to solve the Hamilton-Jacobi-Bellman equation associated with the optimal control law of the auxiliary system. The critic network weights are updated using an indicator function and a concurrent learning technique. With the proposed update law, two restrictive conditions are relaxed: the need for an initial admissible control and the persistence of excitation condition. Moreover, the stability of the closed-loop auxiliary system is guaranteed in the sense that all signals are uniformly ultimately bounded. Finally, the applicability of the developed control strategy is illustrated through simulations of an unstable nonlinear plant and a power system. (c) 2017 Elsevier Ltd. All rights reserved.
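As a rough illustration of the critic idea in the abstract, the sketch below fits a one-weight critic to the Hamilton-Jacobi-Bellman residual of a scalar linear plant, reusing a fixed batch of stored states at every step in the spirit of concurrent learning. Everything in it is an assumption made for illustration: the plant, the cost weights, the quadratic critic, the step size, and the plain gradient-descent update are hypothetical and are not the paper's auxiliary system or its update law, and the indicator-function term is omitted.

```python
import math

# Hypothetical scalar plant dx/dt = A*x + B*u with cost integrand Q*x^2 + R*u^2.
# These values are illustrative assumptions, not taken from the paper.
A, B, Q, R = -1.0, 1.0, 1.0, 1.0

# Fixed batch of stored states, reused at every update step.
STATES = [-1.0, -0.5, 0.5, 1.0]


def bellman_residual(w, x):
    """HJB residual at state x for the critic V(x) = w * x**2."""
    dV = 2.0 * w * x                # critic gradient dV/dx
    u = -0.5 * B * dV / R           # control implied by the critic
    return Q * x**2 + R * u**2 + dV * (A * x + B * u)


def residual_grad(w, x):
    """Derivative of the residual with respect to w (derived by hand)."""
    return x**2 * (2.0 * A - 2.0 * B**2 * w / R)


def train_critic(states, w0=0.0, lr=0.05, steps=500):
    """Gradient descent on 0.5 * sum of squared residuals over the batch."""
    w = w0
    for _ in range(steps):
        grad = sum(bellman_residual(w, x) * residual_grad(w, x) for x in states)
        w -= lr * grad
    return w


def riccati_solution():
    """Closed-form value-function weight for the scalar linear-quadratic case."""
    return R * (A + math.sqrt(A**2 + B**2 * Q / R)) / B**2


if __name__ == "__main__":
    print("learned weight:", train_critic(STATES))
    print("Riccati weight:", riccati_solution())
```

For this linear-quadratic toy case the exact value function is known from the scalar Riccati equation, so the learned weight can be compared against it directly. For a general nonlinear system no such closed form exists, which is why a critic network is used to approximate the solution.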
Pages: 19-30
Page count: 12