H∞ control of linear discrete-time systems: Off-policy reinforcement learning

Cited by: 231
Authors
Kiumarsi, Bahare [1]
Lewis, Frank L. [1,2]
Jiang, Zhong-Ping [3]
Affiliations
[1] Univ Texas Arlington, UTARI, Ft Worth, TX 76118 USA
[2] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang, Peoples R China
[3] New York Univ, Polytech Sch Engn, Dept Elect & Comp Engn, Control & Networks Lab, Brooklyn, NY 11201 USA
Funding
US National Science Foundation;
Keywords
H-infinity control; Off-policy reinforcement learning; Optimal control; Zero-sum games; Tracking control; Equation; Feedback;
DOI
10.1016/j.automatica.2016.12.009
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
In this paper, a model-free solution to the H-infinity control problem for linear discrete-time systems is presented. The proposed approach employs off-policy reinforcement learning (RL) to solve the game algebraic Riccati equation online, using measured data along the system trajectories. As with existing model-free RL algorithms, no knowledge of the system dynamics is required. However, the proposed method has two main advantages. First, the disturbance input does not need to be adjusted in a specific manner; this makes the method more practical, since the disturbance cannot be specified in most real-world applications. Second, no bias arises from adding a probing noise to the control input to maintain the persistence of excitation (PE) condition. Consequently, the convergence of the proposed algorithm is not affected by probing noise. An example of H-infinity control for an F-16 aircraft is given, and it is seen that the convergence of the new off-policy RL algorithm is insensitive to probing noise. (C) 2016 Elsevier Ltd. All rights reserved.
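For background on the game algebraic Riccati equation (GARE) referred to in the abstract: in the standard discrete-time zero-sum formulation (see, e.g., Al-Tamimi, Lewis, and Abu-Khalaf, Automatica 2007), the system is x_{k+1} = A x_k + B u_k + E w_k, and the GARE couples the control u and the disturbance w through an indefinite quadratic form. The sketch below solves the GARE by a plain model-based fixed-point iteration on hypothetical matrices. It is offered only as a baseline for contrast, not as the paper's method, whose point is precisely to solve the same equation from measured trajectory data without knowing A, B, or E; all system matrices, weights, and the attenuation level gamma are illustrative assumptions.

```python
import numpy as np

# Model-based baseline (NOT the paper's model-free algorithm): solve the
# discrete-time game algebraic Riccati equation (GARE)
#   P = A'PA + Q - L' Theta^{-1} L,
#   Theta = [[R + B'PB,  B'PE            ],
#            [E'PB,      E'PE - gamma^2 I]],   L = [B'PA; E'PA],
# by fixed-point iteration. All numbers below are hypothetical.

A = np.array([[0.9, 0.1],
              [0.0, 0.8]])   # assumed state matrix
B = np.array([[0.0],
              [1.0]])        # assumed control input matrix
E = np.array([[0.1],
              [0.0]])        # assumed disturbance input matrix
Q = np.eye(2)                # state weight (assumed)
R = np.eye(1)                # control weight (assumed)
gamma = 1.5                  # attenuation level (assumed above the optimum)

def gare_blocks(P):
    """Theta and L blocks of the GARE for a given value matrix P."""
    Theta = np.block([
        [R + B.T @ P @ B, B.T @ P @ E],
        [E.T @ P @ B,     E.T @ P @ E - gamma**2 * np.eye(1)],
    ])
    L = np.vstack([B.T @ P @ A, E.T @ P @ A])
    return Theta, L

# Fixed-point iteration on the GARE, starting from P = I.
P = np.eye(2)
for _ in range(10000):
    Theta, L = gare_blocks(P)
    P_next = A.T @ P @ A + Q - L.T @ np.linalg.solve(Theta, L)
    if np.linalg.norm(P_next - P) < 1e-12:
        P = P_next
        break
    P = P_next

# Saddle-point policies: [u_k; w_k] = -Theta^{-1} L x_k.
Theta, L = gare_blocks(P)
gains = -np.linalg.solve(Theta, L)
K_u, K_w = gains[:1], gains[1:]  # control gain, worst-case disturbance gain
print("P =", P, "K_u =", K_u, "K_w =", K_w, sep="\n")
```

At convergence, K_u gives the H-infinity state-feedback controller u_k = K_u x_k and K_w the worst-case disturbance policy; the paper's off-policy RL algorithm learns the same quantities without the model knowledge this baseline requires.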
Pages: 144-152
Number of pages: 9