Data-Driven $H_{\infty}$ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning

Cited by: 22

Authors:

Zhang, Li ^{[1,2]}

Fan, Jialu ^{[1,2]}

Xue, Wenqian ^{[1,2]}

Lopez, Victor G. ^{[3]}

Li, Jinna ^{[4]}

Chai, Tianyou ^{[1,2]}

Lewis, Frank L. ^{[5]}

Affiliations:

[1] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China

[2] Northeastern Univ, Int Joint Res Lab Integrated Automat, Shenyang 110819, Peoples R China

[3] Leibniz Univ Hannover, D-30167 Hannover, Germany

[4] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun 113001, Peoples R China

[5] Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118 USA

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2023 / Vol. 34 / No. 07

Keywords:

Heuristic algorithms; Optimal control; Transmission line matrix methods; Process control; Performance analysis; Output feedback; Games; H∞ control; off-policy Q-learning; Q-learning; static output feedback (OPFB); zero-sum game; H-INFINITY CONTROL; ZERO-SUM GAMES; ADAPTIVE OPTIMAL-CONTROL; ALGORITHM; DESIGN;

DOI:

10.1109/TNNLS.2021.3112457

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This article develops two novel output feedback (OPFB) Q-learning algorithms, on-policy Q-learning and off-policy Q-learning, to solve the $H_{\infty}$ static OPFB control problem for linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed OPFB control algorithm form for completely unknown systems. Under the premise that the disturbance attenuation conditions are satisfied, conditions for the existence of the optimal OPFB solution are given. The convergence of the proposed Q-learning methods, as well as the difference and equivalence of the two algorithms, are rigorously proven. Moreover, considering the effects of the probing noise added to ensure persistence of excitation (PE), the proposed off-policy Q-learning method has the advantage of being immune to probing noise and avoiding bias in the solution. Simulation results are presented to verify the effectiveness of the proposed approaches.


Pages: 3553-3567

Number of pages: 15
