Robust Control of Unknown Observable Nonlinear Systems Solved as a Zero-Sum Game

Cited by: 35
Authors
Radac, Mircea-Bogdan [1]
Lala, Timotei [1]
Affiliations
[1] Politehn Univ Timisoara, Dept Automat & Appl Informat, Timisoara 300223, Romania
Keywords
Mathematical model; Robust control; Games; Optimal control; Linear systems; Game theory; Roads; Active suspension system; approximate dynamic programming; neural networks; optimal control; reinforcement learning; state feedback; zero-sum two-player games; STATE-FEEDBACK CONTROL; DISCRETE-TIME-SYSTEMS; H-INFINITY CONTROL; VEHICLE SUSPENSION; LEARNING ALGORITHM; DESIGN;
DOI
10.1109/ACCESS.2020.3040185
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
An optimal robust control solution for general nonlinear systems with unknown but observable dynamics is advanced here. The underlying Hamilton-Jacobi-Isaacs (HJI) equation of the corresponding zero-sum two-player game (ZS-TP-G) is learned using a Q-learning-based approach employing only input-output system measurements, assuming system observability. An equivalent virtual state-space model is built from the system's input-output samples and it is shown that controlling the former implies controlling the latter. Since the existence of a saddle-point solution to the ZS-TP-G is assumed unverifiable, the solution is derived in terms of upper-optimal and lower-optimal controllers. The learning convergence is theoretically ensured while practical implementation is performed using neural networks that provide scalability to the control problem dimension and automatic feature selection. The learning strategy is checked on an active suspension system, a good candidate for the robust control problem with respect to road profile disturbance rejection.
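For orientation, below is a minimal sketch of the discrete-time zero-sum two-player game Q-functions that Q-learning schemes of this kind typically target; the quadratic stage cost and the symbols x_k (state), u_k (control), w_k (disturbance) and \gamma (attenuation level) are illustrative assumptions, not notation taken verbatim from the paper:

r(x_k, u_k, w_k) = x_k^\top Q_x x_k + u_k^\top R u_k - \gamma^2 w_k^\top w_k

\overline{Q}(x_k, u_k, w_k) = r(x_k, u_k, w_k) + \min_{u} \max_{w} \overline{Q}(x_{k+1}, u, w)   (upper value)

\underline{Q}(x_k, u_k, w_k) = r(x_k, u_k, w_k) + \max_{w} \min_{u} \underline{Q}(x_{k+1}, u, w)   (lower value)

In general \max_{w} \min_{u} \underline{Q} \le \min_{u} \max_{w} \overline{Q}, with equality exactly when a saddle point exists; since that existence is treated as unverifiable in the abstract above, the solution is stated through separate upper-optimal and lower-optimal controllers approximated from the learned Q-functions.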
Pages: 214153-214165
Page count: 13