H∞ control of linear discrete-time systems: Off-policy reinforcement learning

Cited by: 231
Authors
Kiumarsi, Bahare [1]
Lewis, Frank L. [1,2]
Jiang, Zhong-Ping [3]
Affiliations
[1] Univ Texas Arlington, UTARI, Ft Worth, TX 76118 USA
[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang, Peoples R China
[3] New York Univ, Polytech Sch Engn, Dept Elect & Comp Engn, Control & Networks Lab, Brooklyn, NY 11201 USA
Funding
US National Science Foundation;
Keywords
H-infinity control; Off-policy reinforcement learning; Optimal control; Zero-sum games; Tracking control; Equation; Feedback;
DOI
10.1016/j.automatica.2016.12.009
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
In this paper, a model-free solution to the H-infinity control problem for linear discrete-time systems is presented. The proposed approach employs off-policy reinforcement learning (RL) to solve the game algebraic Riccati equation online, using measured data along the system trajectories. As with existing model-free RL algorithms, no knowledge of the system dynamics is required. However, the proposed method has two main advantages. First, the disturbance input does not need to be adjusted in a specific manner; this makes the method more practical, since the disturbance cannot be specified in most real-world applications. Second, no bias arises from adding a probing noise to the control input to maintain the persistence of excitation (PE) condition. Consequently, the convergence of the proposed algorithm is not affected by probing noise. An example of H-infinity control for an F-16 aircraft is given, and it is seen that the convergence of the new off-policy RL algorithm is insensitive to probing noise. (C) 2016 Elsevier Ltd. All rights reserved.
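For background on the game algebraic Riccati equation (GARE) referred to in the abstract: in the standard discrete-time zero-sum formulation (see, e.g., Al-Tamimi, Lewis, and Abu-Khalaf, Automatica 2007), the system is x_{k+1} = A x_k + B u_k + E w_k, and the GARE couples the control u and the disturbance w through an indefinite quadratic form. The sketch below solves the GARE by a plain model-based fixed-point iteration on hypothetical matrices. It is offered only as a baseline for contrast, not as the paper's method, whose point is precisely to solve the same equation from measured trajectory data without knowing A, B, or E; all system matrices, weights, and the attenuation level gamma are illustrative assumptions.

```python
import numpy as np

# Model-based baseline (NOT the paper's model-free algorithm): solve the
# discrete-time game algebraic Riccati equation (GARE)
#   P = A'PA + Q - L' Theta^{-1} L,
#   Theta = [[R + B'PB,  B'PE            ],
#            [E'PB,      E'PE - gamma^2 I]],   L = [B'PA; E'PA],
# by fixed-point iteration. All numbers below are hypothetical.

A = np.array([[0.9, 0.1],
              [0.0, 0.8]])   # assumed state matrix
B = np.array([[0.0],
              [1.0]])        # assumed control input matrix
E = np.array([[0.1],
              [0.0]])        # assumed disturbance input matrix
Q = np.eye(2)                # state weight (assumed)
R = np.eye(1)                # control weight (assumed)
gamma = 1.5                  # attenuation level (assumed above the optimum)

def gare_blocks(P):
    """Theta and L blocks of the GARE for a given value matrix P."""
    Theta = np.block([
        [R + B.T @ P @ B, B.T @ P @ E],
        [E.T @ P @ B,     E.T @ P @ E - gamma**2 * np.eye(1)],
    ])
    L = np.vstack([B.T @ P @ A, E.T @ P @ A])
    return Theta, L

# Fixed-point iteration on the GARE, starting from P = I.
P = np.eye(2)
for _ in range(10000):
    Theta, L = gare_blocks(P)
    P_next = A.T @ P @ A + Q - L.T @ np.linalg.solve(Theta, L)
    if np.linalg.norm(P_next - P) < 1e-12:
        P = P_next
        break
    P = P_next

# Saddle-point policies: [u_k; w_k] = -Theta^{-1} L x_k.
Theta, L = gare_blocks(P)
gains = -np.linalg.solve(Theta, L)
K_u, K_w = gains[:1], gains[1:]  # control gain, worst-case disturbance gain
print("P =", P, "K_u =", K_u, "K_w =", K_w, sep="\n")
```

At convergence, K_u gives the H-infinity state-feedback controller u_k = K_u x_k and K_w the worst-case disturbance policy; the paper's off-policy RL algorithm learns the same quantities without the model knowledge this baseline requires.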
Pages: 144-152
Number of pages: 9