Value Iteration-Based H∞ Controller Design for Continuous-Time Nonlinear Systems Subject to Input Constraints

Cited by: 29
Authors
Zhang, Huaguang [1 ,2 ]
Xiao, Geyang [2 ]
Liu, Yang [2 ]
Liu, Lei [3 ]
Affiliations
[1] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110004, Peoples R China
[2] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Peoples R China
[3] Liaoning Univ Technol, Coll Sci, Jinzhou 121001, Peoples R China
Source
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS | 2020, Vol. 50, No. 11
Funding
National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program);
Keywords
Approximate dynamic programming; H-infinity control; reinforcement learning (RL); value iteration (VI); ZERO-SUM GAMES; STATE-FEEDBACK CONTROL; POLICY UPDATE ALGORITHM; QUADRATIC GAMES; LINEAR-SYSTEMS; EQUATION;
DOI
10.1109/TSMC.2018.2853091
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
In this paper, a novel integral reinforcement learning method based on value iteration (VI) is proposed to design the H-infinity controller for continuous-time nonlinear systems subject to input constraints. To handle the input constraints, a nonquadratic function is introduced to reconstruct the L-2-gain condition of the H-infinity control problem. The VI method then solves the corresponding Hamilton-Jacobi-Isaacs equation, initialized with an arbitrary positive semidefinite value function. Compared with most existing works based on policy iteration, no initial admissible control policy is required, which allows a much freer choice of initial condition. The iterative process of the proposed VI method is analyzed, and convergence to the saddle-point solution is proved in a general setting. For implementation, only one neural network is introduced to approximate the iterative value function, yielding a simpler architecture with less computational load than schemes using three neural networks. To verify the effectiveness of the VI-based method, two nonlinear examples are presented.
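The abstract's central point, that VI for a zero-sum (H-infinity-type) problem can start from an arbitrary nonnegative value function rather than an admissible policy, can be illustrated on a discrete analogue. The sketch below runs value iteration on a small synthetic discounted zero-sum Markov game in which the controller minimizes and the disturbance maximizes; all states, costs, and transition probabilities are illustrative placeholders, not the paper's continuous-time algorithm.

```python
import numpy as np

# Minimal value-iteration sketch for a discounted zero-sum Markov game:
# the control action u minimizes, the disturbance action w maximizes.
# The problem data (costs, transitions) are randomly generated placeholders.
n_states, n_u, n_w = 3, 2, 2
rng = np.random.default_rng(0)
cost = rng.uniform(0.0, 1.0, size=(n_states, n_u, n_w))   # cost(s, u, w)
P = rng.uniform(size=(n_states, n_u, n_w, n_states))       # P[s, u, w, s']
P /= P.sum(axis=-1, keepdims=True)                         # normalize rows
gamma = 0.9

V = np.zeros(n_states)  # arbitrary nonnegative initialization; no policy needed
for _ in range(500):
    # Q[s, u, w] = cost(s, u, w) + gamma * E[V(s') | s, u, w]
    Q = cost + gamma * np.einsum('suwt,t->suw', P, V)
    # Upper value: controller minimizes over u the disturbance's best response
    V_new = Q.max(axis=2).min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

print(np.round(V, 4))
```

Because the minimax Bellman operator is a gamma-contraction, the iteration converges from any bounded starting V; this mirrors (in a discrete, finite setting) the paper's claim that VI relaxes the admissible-initial-policy requirement of policy iteration.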
Pages: 3986-3995
Page count: 10