Robust Control of Unknown Observable Nonlinear Systems Solved as a Zero-Sum Game

Cited by: 29
Authors
Radac, Mircea-Bogdan [1 ]
Lala, Timotei [1 ]
Affiliations
[1] Politehn Univ Timisoara, Dept Automat & Appl Informat, Timisoara 300223, Romania
Source
IEEE ACCESS | 2020 / Vol. 8 / Issue 08
Keywords
Mathematical model; Robust control; Games; Optimal control; Linear systems; Game theory; Roads; Active suspension system; approximate dynamic programming; neural networks; optimal control; reinforcement learning; state feedback; zero-sum two-player games; STATE-FEEDBACK CONTROL; DISCRETE-TIME-SYSTEMS; H-INFINITY CONTROL; VEHICLE SUSPENSION; LEARNING ALGORITHM; DESIGN;
DOI
10.1109/ACCESS.2020.3040185
Chinese Library Classification (CLC)
TP [automation technology, computer technology];
Discipline Code
0812 ;
Abstract
An optimal robust control solution for general nonlinear systems with unknown but observable dynamics is advanced here. The underlying Hamilton-Jacobi-Isaacs (HJI) equation of the corresponding zero-sum two-player game (ZS-TP-G) is learned using a Q-learning-based approach that employs only input-output system measurements, under a system observability assumption. An equivalent virtual state-space model is built from the system's input-output samples, and it is shown that controlling this virtual model implies controlling the underlying system. Since the existence of a saddle-point solution to the ZS-TP-G is assumed unverifiable, the solution is derived in terms of upper-optimal and lower-optimal controllers. Learning convergence is theoretically ensured, while the practical implementation uses neural networks that provide scalability with the control problem's dimension and automatic feature selection. The learning strategy is validated on an active suspension system, a good candidate for the robust control problem owing to its road-profile disturbance-rejection requirement.
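To make the zero-sum game setting in the abstract concrete, the sketch below illustrates the discrete-time linear-quadratic special case only (not the paper's model-free, input-output Q-learning algorithm): value iteration on the game algebraic Riccati equation for x_{k+1} = A x_k + B u_k + E w_k with stage cost x'Qx + u'Ru - γ²w'w, where u minimizes and the disturbance w maximizes. All matrices and the attenuation level γ are made-up toy values chosen for illustration.

```python
import numpy as np

# Toy linear-quadratic zero-sum two-player game (illustrative values only):
#   x_{k+1} = A x_k + B u_k + E w_k,  cost = sum x'Qx + u'Ru - gamma^2 w'w
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0], [1.0]])   # control (minimizing player) channel
E = np.array([[0.1], [0.0]])   # disturbance (maximizing player) channel
Q = np.eye(2)
R = np.eye(1)
gamma = 2.0                    # attenuation level; must exceed the game's gamma*

BE = np.hstack([B, E])         # joint player input matrix [B  E]

def riccati_step(P):
    """One value-iteration sweep of the zero-sum game Riccati recursion."""
    S = np.block([[R + B.T @ P @ B, B.T @ P @ E],
                  [E.T @ P @ B, E.T @ P @ E - gamma**2 * np.eye(1)]])
    return Q + A.T @ P @ A - (A.T @ P @ BE) @ np.linalg.solve(S, BE.T @ P @ A)

# Value iteration from P = 0 converges to the game-value matrix when the
# saddle point exists (gamma large enough, as assumed here).
P = np.zeros((2, 2))
for _ in range(1000):
    P_next = riccati_step(P)
    if np.max(np.abs(P_next - P)) < 1e-12:
        break
    P = P_next
P = P_next

# Saddle-point state feedback for both players: [u; w] = -K x.
S = np.block([[R + B.T @ P @ B, B.T @ P @ E],
              [E.T @ P @ B, E.T @ P @ E - gamma**2 * np.eye(1)]])
K = np.linalg.solve(S, BE.T @ P @ A)
print("game-value matrix P:\n", P)
```

The paper's contribution replaces this model-based recursion with a Q-learning scheme driven by input-output data and neural-network approximators; when no verifiable saddle point exists, it works with upper- and lower-optimal controllers instead of the single gain K above.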
Pages: 214153-214165
Page count: 13
Related Papers
54 records in total
  • [1] Abu-Khalaf M., Lewis F.L., Huang J., "Policy iterations on the Hamilton-Jacobi-Isaacs equation for H∞ state feedback control with input saturation," IEEE Transactions on Automatic Control, 2006, 51(12): 1989-1995
  • [2] Akraminia M., Nonlinear Engineering: Modeling and Application, 2015, 4: 141, DOI 10.1515/nleng-2015-0004
  • [3] Al-Dabooni S., Wunsch D., "The Boundedness Conditions for Model-Free HDP(λ)," IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(7): 1928-1942
  • [4] Al-Tamimi A., Abu-Khalaf M., Lewis F.L., "Adaptive critic designs for discrete-time zero-sum games with application to H∞ control," IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2007, 37(1): 240-247
  • [5] Basar T., H∞-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach, 1995
  • [6] Bucak I.O., Oz H.R., "Vibration control of a nonlinear quarter-car active suspension system by reinforcement learning," International Journal of Systems Science, 2012, 43(6): 1177-1190
  • [7] Busoniu L., de Bruin T., Tolic D., Kober J., Palunko I., "Reinforcement learning for control: Performance, stability, and deep approximators," Annual Reviews in Control, 2018, 46: 8-28
  • [8] de Bruin T., IEEE Robotics and Automation Letters, 2018, 3: 1394, DOI 10.1109/LRA.2018.2800101
  • [9] Deptula P., Rosenfeld J.A., Kamalapurkar R., Dixon W.E., "Approximate Dynamic Programming: Combining Regional and Local State Following Approximations," IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(6): 2154-2166
  • [10] Dong B., An T., Zhou F., Liu K., Yu W., Li Y., "Actor-Critic-Identifier Structure-Based Decentralized Neuro-Optimal Control of Modular Robot Manipulators With Environmental Collisions," IEEE Access, 2019, 7: 96148-96165