Adaptive Optimal Control for a Class of Nonlinear Systems: The Online Policy Iteration Approach

Cited by: 161
Authors
He, Shuping [1 ,2 ]
Fang, Haiyang [1 ]
Zhang, Maoguang [1 ]
Liu, Fei [3 ]
Ding, Zhengtao [4 ]
Affiliations
[1] Anhui Univ, Sch Elect Engn & Automat, Hefei 230601, Peoples R China
[2] Anhui Univ, Inst Phys Sci & Informat Technol, Hefei 230601, Peoples R China
[3] Jiangnan Univ, Inst Automat, Minist Educ, Key Lab Adv Proc Control Light Ind, Wuxi 214122, Jiangsu, Peoples R China
[4] Univ Manchester, Sch Elect & Elect Engn, Manchester M13 9PL, Lancs, England
Funding
National Natural Science Foundation of China;
Keywords
Adaptive optimal control; algebraic Riccati equation (ARE); linear differential inclusion (LDI); nonlinear systems; policy iteration (PI); STABILITY ANALYSIS; LINEAR-SYSTEMS; CONTROL DESIGN; TIME; ALGORITHM;
DOI
10.1109/TNNLS.2019.2905715
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405;
Abstract
This paper studies online adaptive optimal controller design for a class of nonlinear systems through a novel policy iteration (PI) algorithm. By using the neural-network-based linear differential inclusion (LDI) technique to linearize the nonlinear terms at each iteration, the optimal control law can be obtained by solving the relevant algebraic Riccati equation (ARE) without using the system's internal parameters. Based on the PI approach, the adaptive optimal control algorithm is developed with online linearization and a two-step iteration, i.e., policy evaluation and policy improvement. The convergence of the proposed PI algorithm is also proved. Finally, two numerical examples are given to illustrate the effectiveness and applicability of the proposed method.
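In the linear case, the two-step iteration described in the abstract (policy evaluation via a Lyapunov equation, policy improvement via the gain update) reduces to Kleinman's classical policy iteration for the continuous-time ARE. Below is a minimal offline sketch in Python; the second-order system matrices and the initial stabilizing gain are hypothetical illustration values, not taken from the paper:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

def kleinman_pi(A, B, Q, R, K0, iters=20):
    """Kleinman's policy iteration for the continuous-time ARE.

    Policy evaluation: solve (A - B K)^T P + P (A - B K) + Q + K^T R K = 0.
    Policy improvement: K <- R^{-1} B^T P.
    K0 must be stabilizing, i.e., A - B K0 Hurwitz.
    """
    K = K0
    for _ in range(iters):
        Ac = A - B @ K
        # solve_continuous_lyapunov solves a X + X a^H = q
        P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
        K = np.linalg.solve(R, B.T @ P)
    return P, K

# Hypothetical unstable second-order example
A = np.array([[0.0, 1.0], [-1.0, 2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[0.0, 5.0]])  # A - B K0 has char. poly s^2 + 3s + 1: Hurwitz
P, K = kleinman_pi(A, B, Q, R, K0)
P_are = solve_continuous_are(A, B, Q, R)  # direct ARE solution for comparison
```

The iterates converge monotonically (and quadratically) to the stabilizing ARE solution; the paper's contribution is performing this loop online on LDI linearizations of the nonlinear dynamics, without the system's internal parameters.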
Pages: 549-558
Page count: 10