Value Iteration-Based H∞ Controller Design for Continuous-Time Nonlinear Systems Subject to Input Constraints

Cited by: 29
Authors
Zhang, Huaguang [1 ,2 ]
Xiao, Geyang [2 ]
Liu, Yang [2 ]
Liu, Lei [3 ]
Affiliations
[1] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110004, Peoples R China
[2] Northeastern Univ, Sch Informat Sci & Engn, Shenyang 110004, Peoples R China
[3] Liaoning Univ Technol, Coll Sci, Jinzhou 121001, Peoples R China
Source
IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS | 2020, Vol. 50, No. 11
Funding
National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program);
Keywords
Approximate dynamic programming; H-infinity control; reinforcement learning (RL); value iteration (VI); ZERO-SUM GAMES; STATE-FEEDBACK CONTROL; POLICY UPDATE ALGORITHM; QUADRATIC GAMES; LINEAR-SYSTEMS; EQUATION;
DOI
10.1109/TSMC.2018.2853091
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
In this paper, a novel integral reinforcement learning method based on value iteration (VI) is proposed to design the H-infinity controller for continuous-time nonlinear systems subject to input constraints. To handle the input constraints, a nonquadratic function is introduced to reconstruct the L-2-gain condition of the H-infinity control problem. The VI method then solves the corresponding Hamilton-Jacobi-Isaacs equation, initialized with an arbitrary positive semidefinite value function. Compared with most existing works based on policy iteration, no initial admissible control policy is required, which allows a much freer choice of initial condition. The iterative process of the proposed VI method is analyzed, and convergence to the saddle-point solution is proved in a general setting. For implementation, only one neural network is introduced to approximate the iterative value function, yielding a simpler architecture with less computational load than schemes using three neural networks. To verify the effectiveness of the VI-based method, two nonlinear examples are presented.
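The abstract's central point, that VI for a zero-sum (H-infinity-type) problem can start from an arbitrary nonnegative value function rather than an admissible policy, can be illustrated on a discrete analogue. The sketch below runs value iteration on a small synthetic discounted zero-sum Markov game in which the controller minimizes and the disturbance maximizes; all states, costs, and transition probabilities are illustrative placeholders, not the paper's continuous-time algorithm.

```python
import numpy as np

# Minimal value-iteration sketch for a discounted zero-sum Markov game:
# the control action u minimizes, the disturbance action w maximizes.
# The problem data (costs, transitions) are randomly generated placeholders.
n_states, n_u, n_w = 3, 2, 2
rng = np.random.default_rng(0)
cost = rng.uniform(0.0, 1.0, size=(n_states, n_u, n_w))   # cost(s, u, w)
P = rng.uniform(size=(n_states, n_u, n_w, n_states))       # P[s, u, w, s']
P /= P.sum(axis=-1, keepdims=True)                         # normalize rows
gamma = 0.9

V = np.zeros(n_states)  # arbitrary nonnegative initialization; no policy needed
for _ in range(500):
    # Q[s, u, w] = cost(s, u, w) + gamma * E[V(s') | s, u, w]
    Q = cost + gamma * np.einsum('suwt,t->suw', P, V)
    # Upper value: controller minimizes over u the disturbance's best response
    V_new = Q.max(axis=2).min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        V = V_new
        break
    V = V_new

print(np.round(V, 4))
```

Because the minimax Bellman operator is a gamma-contraction, the iteration converges from any bounded starting V; this mirrors (in a discrete, finite setting) the paper's claim that VI relaxes the admissible-initial-policy requirement of policy iteration.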
Pages: 3986-3995
Page count: 10