Adaptive Interleaved Reinforcement Learning: Robust Stability of Affine Nonlinear Systems With Unknown Uncertainty

Cited by: 72
Authors
Li, Jinna [1 ]
Ding, Jinliang [2 ]
Chai, Tianyou [2 ]
Lewis, Frank L. [3 ]
Jagannathan, Sarangapani [4 ]
Affiliations
[1] Liaoning Shihua Univ, Sch Informat & Control Engn, Fushun 113001, Peoples R China
[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China
[3] Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118 USA
[4] Missouri Univ Sci & Technol, Dept Elect & Comp Engn, Rolla, MO 65409 USA
Funding
National Natural Science Foundation of China
Keywords
Uncertainty; Nonlinear systems; Robust control; Optimal control; Adaptive systems; Robust stability; Linear systems; Interleaved reinforcement learning; neural networks (NNs); robust control; uncertain systems; STABILIZATION; DESIGN; ALGORITHM; EQUATION;
DOI
10.1109/TNNLS.2020.3027653
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This article investigates adaptive robust controller design for discrete-time (DT) affine nonlinear systems using adaptive dynamic programming. A novel adaptive interleaved reinforcement learning algorithm is developed for finding a robust controller of DT affine nonlinear systems subject to matched or unmatched uncertainties. To this end, the robust control problem is converted into an optimal control problem for the nominal system by selecting an appropriate utility function. Performance evaluation and control policy updates, combined with neural network approximation, are alternately implemented at each time step to solve a simplified Hamilton-Jacobi-Bellman (HJB) equation, such that uniformly ultimately bounded (UUB) stability of the DT affine nonlinear system is guaranteed for all realizations of the unknown bounded uncertainties. Rigorous theoretical proofs of the convergence of the proposed interleaved RL algorithm and of the UUB stability of the uncertain systems are provided. Simulation results verify the effectiveness of the proposed method.
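The alternating performance-evaluation / policy-update loop described in the abstract can be sketched as follows. Everything concrete here (the nominal dynamics, the uncertainty bound rho, the two-feature critic, and the step sizes) is an illustrative assumption for a scalar example, not taken from the paper, which uses neural-network approximators.

```python
import numpy as np

# Hypothetical DT affine nonlinear nominal system x_{k+1} = f(x_k) + g(x_k) u_k
f = lambda x: 0.8 * np.sin(x)               # assumed nominal drift
g = lambda x: 1.0 + 0.1 * np.cos(x)         # assumed input gain
rho = lambda x: 0.2 * abs(x)                # assumed bound on the uncertainty

# Utility augmented with rho(x)^2, mirroring the conversion of the robust
# control problem into an optimal control problem for the nominal system.
utility = lambda x, u: rho(x) ** 2 + x ** 2 + u ** 2

phi = lambda x: np.array([x ** 2, x ** 4])      # critic features, V(x) ~ w . phi(x)
dphi = lambda x: np.array([2 * x, 4 * x ** 3])  # feature gradients d(phi)/dx

w = np.zeros(2)      # critic weights
alpha = 0.05         # critic step size
x = 1.0              # initial state
for k in range(200):
    # Policy update: greedy control from the current critic,
    # u = -(1/2) g(x) dV/dx, with dV/dx evaluated near the next state f(x).
    u = -0.5 * g(x) * (w @ dphi(f(x)))
    x_next = f(x) + g(x) * u
    # Performance evaluation, interleaved one TD step per time instant:
    td = utility(x, u) + w @ phi(x_next) - w @ phi(x)
    w += alpha * td * phi(x)
    x = x_next
```

A two-feature linear critic stands in for the paper's neural networks purely to keep the interleaved structure visible: both updates happen inside a single pass over time, rather than iterating evaluation to convergence before each improvement.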
Pages: 270-280
Number of pages: 11
Related Papers
29 total
[1]  
[Anonymous], 2006, Neural Network Control of Nonlinear Discrete-Time Systems
[2]   Generalized Hamilton-Jacobi-Bellman formulation-based neural network control of affine nonlinear discrete-time systems [J].
Chen, Zheng ;
Jagannathan, Sarangapani .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 2008, 19 (01) :90-106
[3]   Adaptive Actor-Critic Design-Based Integral Sliding-Mode Control for Partially Unknown Nonlinear Systems With Input Disturbances [J].
Fan, Quan-Yong ;
Yang, Guang-Hong .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2016, 27 (01) :165-177
[4]   Output-feedback adaptive optimal control of interconnected systems based on robust adaptive dynamic programming [J].
Gao, Weinan ;
Jiang, Yu ;
Jiang, Zhong-Ping ;
Chai, Tianyou .
AUTOMATICA, 2016, 72 :37-45
[5]   Data-Driven Flotation Industrial Process Operational Optimal Control Based on Reinforcement Learning [J].
Jiang, Yi ;
Fan, Jialu ;
Chai, Tianyou ;
Li, Jinna ;
Lewis, Frank L. .
IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2018, 14 (05) :1974-1989
[6]   Robust Adaptive Dynamic Programming and Feedback Stabilization of Nonlinear Systems [J].
Jiang, Yu ;
Jiang, Zhong-Ping .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2014, 25 (05) :882-893
[7]   Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control [J].
Lewis, Frank L. ;
Vrabie, Draguna .
IEEE CIRCUITS AND SYSTEMS MAGAZINE, 2009, 9 (03) :32-50
[8]   Off-Policy Interleaved Q-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems [J].
Li, Jinna ;
Chai, Tianyou ;
Lewis, Frank L. ;
Ding, Zhengtao ;
Jiang, Yi .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) :1308-1320
[9]   Off-Policy Reinforcement Learning for Synchronization in Multiagent Graphical Games [J].
Li, Jinna ;
Modares, Hamidreza ;
Chai, Tianyou ;
Lewis, Frank L. ;
Xie, Lihua .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2017, 28 (10) :2434-2445
[10]   An optimal control approach to robust control design [J].
Lin, F .
INTERNATIONAL JOURNAL OF CONTROL, 2000, 73 (03) :177-186