Model-Free Adaptive Control for Unknown Nonlinear Zero-Sum Differential Game

Cited by: 96
Authors
Zhong, Xiangnan [1 ]
He, Haibo [1 ]
Wang, Ding [2 ]
Ni, Zhen [3 ]
Affiliations
[1] Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA
[2] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[3] South Dakota State Univ, Dept Elect Engn & Comp Sci, Brookings, SD 57007 USA
Funding
Beijing Natural Science Foundation; US National Science Foundation; National Natural Science Foundation of China;
Keywords
Adaptive dynamic programming (ADP); globalized dual heuristic dynamic programming (GDHP); model-free; neural networks; zero-sum game; H-INFINITY CONTROL; STATE-FEEDBACK CONTROL; CONTROL SCHEME; SYSTEMS; APPROXIMATION; TRACKING; REPRESENTATION; ALGORITHM; EQUATION;
DOI
10.1109/TCYB.2017.2712617
CLC number
TP [Automation Technology, Computer Technology];
Discipline code
0812;
Abstract
In this paper, we present a new model-free globalized dual heuristic dynamic programming (GDHP) approach for discrete-time nonlinear zero-sum game problems. First, an online learning algorithm based on the GDHP method is proposed to solve the Hamilton-Jacobi-Isaacs equation associated with the H-infinity optimal regulation control problem. By shifting the definition of the performance index backward one step, the proposed method relaxes the requirement for knowledge of the system dynamics or an identifier. Then, three neural networks are established to approximate the optimal saddle-point feedback control law, the disturbance law, and the performance index, respectively. Explicit updating rules for these three neural networks are derived from the data generated during online learning along the system trajectories. The stability analysis, in terms of the neural network approximation errors, is carried out via the Lyapunov approach. Finally, two simulation examples demonstrate the effectiveness of the proposed method.
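The three-network structure described in the abstract can be sketched in code. The following is a hypothetical illustration, not the authors' implementation: simple linear-in-feature approximators stand in for the critic (performance index), action (control law), and disturbance networks, the utility function x² + u² − γ²w² and all parameter values are assumptions, and the updates are myopic TD-style gradient steps on sampled transitions, so no system model is needed.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(x):
    # quadratic basis for a scalar state (stand-in for a neural network layer)
    return np.array([x, x**2])

# small random initial weights for the three approximators
Wc = rng.normal(size=2) * 0.1   # critic network      -> performance index V(x)
Wa = rng.normal(size=2) * 0.1   # action network      -> control law u(x)
Wd = rng.normal(size=2) * 0.1   # disturbance network -> disturbance law w(x)

gamma2 = 4.0   # gamma^2: disturbance attenuation level (assumed value)
lr = 0.01      # learning rate

def V(x): return Wc @ features(x)
def u(x): return Wa @ features(x)
def w(x): return Wd @ features(x)

def update(x, uk, wk, x_next):
    """One model-free update from a sampled transition (x, u, w, x_next)."""
    global Wc, Wa, Wd
    # zero-sum utility: state/control cost minus attenuated disturbance cost
    r = x**2 + uk**2 - gamma2 * wk**2
    # temporal-difference error of the Bellman equation V(x) = r + V(x_next)
    delta = r + V(x_next) - V(x)
    phi = features(x)
    Wc = Wc - lr * delta * phi                  # critic: descend the TD error
    Wa = Wa - lr * 2.0 * uk * phi               # minimizing player: reduce control cost
    Wd = Wd + lr * (-2.0 * gamma2 * wk) * phi   # maximizing player: myopic ascent in w
```

Only measured transitions (x, u, w, x_next) enter `update`, which mirrors the model-free character of the approach; the actual GDHP rules in the paper also propagate costate (derivative) information, which this sketch omits.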
Pages: 1633-1646
Page count: 14