H∞ control with constrained input for completely unknown nonlinear systems using data-driven reinforcement learning method

Citations: 40
Authors
Jiang, He [1 ]
Zhang, Huaguang [1 ]
Luo, Yanhong [1 ]
Cui, Xiaohong [1 ]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Box 134, Shenyang 110819, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Reinforcement learning; Adaptive dynamic programming; Data-driven; Neural networks; OPTIMAL TRACKING CONTROL; DYNAMIC-PROGRAMMING ALGORITHM; DIFFERENTIAL GRAPHICAL GAMES; POLICY UPDATE ALGORITHM; ZERO-SUM GAME; FEEDBACK-CONTROL; CONTROL DESIGN; TIME-SYSTEMS; ITERATION; SYNCHRONIZATION;
DOI
10.1016/j.neucom.2016.11.041
CLC number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates the H-infinity control problem for nonlinear systems with completely unknown dynamics and constrained control input by utilizing a novel data-driven reinforcement learning method. It is known that the nonlinear H-infinity control problem relies on the solution of the Hamilton-Jacobi-Isaacs (HJI) equation, which is essentially a nonlinear partial differential equation and is generally impossible to solve analytically. To overcome this difficulty, we first propose a model-based simultaneous policy update algorithm that learns the solution of the HJI equation iteratively, and we provide its convergence proof. Then, based on this model-based method, we develop a data-driven model-free algorithm that requires only real system sampling data, generated by arbitrary different control inputs and external disturbances, instead of an accurate system model, and we prove that the two algorithms are equivalent. To implement this model-free algorithm, three neural networks (NNs) are employed to approximate the iterative performance index function, the control policy, and the disturbance policy, respectively, and a least-squares approach is used to minimize the NN approximation residual errors. Finally, the proposed scheme is tested on the rotational/translational actuator nonlinear system.
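The least-squares step mentioned in the abstract can be illustrated with a minimal sketch: in such ADP implementations the critic NN output is typically linear in its weights, V(x) ≈ wᵀφ(x), so minimizing the approximation residual over sampled states reduces to an ordinary least-squares problem. The basis functions and the synthetic data below are illustrative assumptions, not the paper's actual choices.

```python
import numpy as np

def phi(x):
    """Polynomial basis for a 2-D state (hypothetical choice)."""
    x1, x2 = x
    return np.array([x1**2, x1 * x2, x2**2])

def least_squares_weights(states, targets):
    """Solve min_w sum_k (w^T phi(x_k) - y_k)^2 over sampled states."""
    Phi = np.array([phi(x) for x in states])   # stack basis rows (K x 3)
    w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    return w

# Synthetic check: targets generated from known weights, so the
# least-squares fit should recover them exactly (noise-free data).
rng = np.random.default_rng(0)
states = rng.standard_normal((50, 2))
w_true = np.array([1.0, -0.5, 2.0])
targets = np.array([w_true @ phi(x) for x in states])
w_hat = least_squares_weights(states, targets)
```

In the paper's model-free setting, the targets would come from sampled system trajectories rather than a known weight vector, and analogous fits are performed for the actor and disturbance NNs.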
Pages: 226-234
Number of pages: 9