H∞ control with constrained input for completely unknown nonlinear systems using data-driven reinforcement learning method

Citations: 40
Authors
Jiang, He [1 ]
Zhang, Huaguang [1 ]
Luo, Yanhong [1 ]
Cui, Xiaohong [1 ]
Affiliations
[1] Northeastern Univ, Coll Informat Sci & Engn, Box 134, Shenyang 110819, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Reinforcement learning; Adaptive dynamic programming; Data-driven; Neural networks; OPTIMAL TRACKING CONTROL; DYNAMIC-PROGRAMMING ALGORITHM; DIFFERENTIAL GRAPHICAL GAMES; POLICY UPDATE ALGORITHM; ZERO-SUM GAME; FEEDBACK-CONTROL; CONTROL DESIGN; TIME-SYSTEMS; ITERATION; SYNCHRONIZATION;
DOI
10.1016/j.neucom.2016.11.041
CLC number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper investigates the H-infinity control problem for nonlinear systems with completely unknown dynamics and constrained control input by utilizing a novel data-driven reinforcement learning method. It is known that the nonlinear H-infinity control problem relies on the solution of the Hamilton-Jacobi-Isaacs (HJI) equation, which is essentially a nonlinear partial differential equation and is generally impossible to solve analytically. To overcome this difficulty, we first propose a model-based simultaneous policy update algorithm that learns the solution of the HJI equation iteratively, and we provide its convergence proof. Then, based on this model-based method, we develop a data-driven model-free algorithm that requires only real system sampling data, generated by arbitrary different control inputs and external disturbances, instead of an accurate system model, and we prove that the two algorithms are equivalent. To implement this model-free algorithm, three neural networks (NNs) are employed to approximate the iterative performance index function, the control policy, and the disturbance policy, respectively, and a least-squares approach is used to minimize the NN approximation residual errors. Finally, the proposed scheme is tested on the rotational/translational actuator nonlinear system.
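The least-squares step mentioned in the abstract can be illustrated with a minimal sketch: in such ADP implementations the critic NN output is typically linear in its weights, V(x) ≈ wᵀφ(x), so minimizing the approximation residual over sampled states reduces to an ordinary least-squares problem. The basis functions and the synthetic data below are illustrative assumptions, not the paper's actual choices.

```python
import numpy as np

def phi(x):
    """Polynomial basis for a 2-D state (hypothetical choice)."""
    x1, x2 = x
    return np.array([x1**2, x1 * x2, x2**2])

def least_squares_weights(states, targets):
    """Solve min_w sum_k (w^T phi(x_k) - y_k)^2 over sampled states."""
    Phi = np.array([phi(x) for x in states])   # stack basis rows (K x 3)
    w, *_ = np.linalg.lstsq(Phi, targets, rcond=None)
    return w

# Synthetic check: targets generated from known weights, so the
# least-squares fit should recover them exactly (noise-free data).
rng = np.random.default_rng(0)
states = rng.standard_normal((50, 2))
w_true = np.array([1.0, -0.5, 2.0])
targets = np.array([w_true @ phi(x) for x in states])
w_hat = least_squares_weights(states, targets)
```

In the paper's model-free setting, the targets would come from sampled system trajectories rather than a known weight vector, and analogous fits are performed for the actor and disturbance NNs.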
Pages: 226-234
Number of pages: 9