Data-Driven $H_{\infty}$ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning

Cited: 22
Authors
Zhang, Li [1 ,2 ]
Fan, Jialu [1 ,2 ]
Xue, Wenqian [1 ,2 ]
Lopez, Victor G. [3 ]
Li, Jinna [4 ]
Chai, Tianyou [1 ,2 ]
Lewis, Frank L. [5 ]
Affiliations
[1] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China
[2] Northeastern Univ, Int Joint Res Lab Integrated Automat, Shenyang 110819, Peoples R China
[3] Leibniz Univ Hannover, D-30167 Hannover, Germany
[4] Liaoning Petrochem Univ, Sch Informat & Control Engn, Fushun 113001, Peoples R China
[5] Univ Texas Arlington, UTA Res Inst, Arlington, TX 76118 USA
Keywords
Heuristic algorithms; Optimal control; Transmission line matrix methods; Process control; Performance analysis; Output feedback; Games; H∞ control; off-policy Q-learning; Q-learning; static output feedback (OPFB); zero-sum game; H-INFINITY CONTROL; ZERO-SUM GAMES; ADAPTIVE OPTIMAL-CONTROL; ALGORITHM; DESIGN;
DOI
10.1109/TNNLS.2021.3112457
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article develops two novel output feedback (OPFB) Q-learning algorithms, on-policy Q-learning and off-policy Q-learning, to solve the $H_{\infty}$ static OPFB control problem for linear discrete-time (DT) systems. The primary contribution of the proposed algorithms lies in a newly developed static OPFB control law form for completely unknown systems. Conditions for the existence of the optimal static OPFB solution are given under the disturbance attenuation condition. The convergence of the proposed Q-learning methods, as well as the differences and equivalence between the two algorithms, are rigorously proven. Moreover, whereas probing noise must be injected to maintain the persistence of excitation (PE), the proposed off-policy Q-learning method is immune to this probing noise and thus avoids bias in the solution. Simulation results are presented to verify the effectiveness of the proposed approaches.
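To make the problem setting concrete, the abstract's $H_{\infty}$ control problem can be posed as a zero-sum game on $x_{k+1} = Ax_k + Bu_k + Ew_k$ with stage cost $x_k^\top Qx_k + u_k^\top Ru_k - \gamma^2 w_k^\top w_k$. The sketch below is not the paper's data-driven OPFB algorithm; it is a minimal model-based value iteration on the underlying game algebraic Riccati equation (GARE) that such Q-learning schemes target, using made-up example matrices:

```python
import numpy as np

# Illustrative example system (all matrices are made-up, not from the paper):
# x_{k+1} = A x_k + B u_k + E w_k, stage cost x'Qx + u'Ru - gamma^2 w'w.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
E = np.array([[0.1], [0.0]])
Q = np.eye(2)
R = np.eye(1)
gamma = 2.0  # assumed disturbance attenuation level (must exceed gamma*)

def kernel_blocks(P):
    """Stacked input matrix [B E] and the game's quadratic-term block S."""
    BE = np.hstack([B, E])
    S = np.block([[R + B.T @ P @ B, B.T @ P @ E],
                  [E.T @ P @ B, E.T @ P @ E - gamma**2 * np.eye(1)]])
    return BE, S

# Value iteration on the GARE:
#   P <- Q + A'PA - A'P[B E] S^{-1} [B E]'PA
P = np.zeros((2, 2))
for _ in range(1000):
    BE, S = kernel_blocks(P)
    P_next = Q + A.T @ P @ A - (A.T @ P @ BE) @ np.linalg.solve(S, BE.T @ P @ A)
    done = np.linalg.norm(P_next - P) < 1e-12
    P = P_next
    if done:
        break

# Saddle-point gains from the converged kernel: u_k = -K x_k, w_k = -L x_k.
BE, S = kernel_blocks(P)
KL = np.linalg.solve(S, BE.T @ P @ A)
K, L = KL[:1, :], KL[1:, :]
print("P =", P)
```

The disturbance attenuation condition referenced in the abstract corresponds here to $\gamma^2 I - E^\top P E > 0$ at the converged $P$; the data-driven algorithms in the article estimate an equivalent Q-function kernel from measured input/output data instead of using $(A, B, E)$.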
Pages: 3553-3567 (15 pages)
Related Papers (50 total)
  • [31] Off-Policy Reinforcement Learning for Optimal Preview Tracking Control of Linear Discrete-Time systems with unknown dynamics
    Wang, Chao-Ran
    Wu, Huai-Ning
    2018 CHINESE AUTOMATION CONGRESS (CAC), 2018, : 1402 - 1407
  • [32] Output-feedback Q-learning for discrete-time linear H∞ tracking control: A Stackelberg game approach
    Ren, Yunxiao
    Wang, Qishao
    Duan, Zhisheng
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2022, 32 (12) : 6805 - 6828
  • [33] Optimal Output-Feedback Control of Unknown Continuous-Time Linear Systems Using Off-policy Reinforcement Learning
    Modares, Hamidreza
    Lewis, Frank L.
    Jiang, Zhong-Ping
    IEEE TRANSACTIONS ON CYBERNETICS, 2016, 46 (11) : 2401 - 2410
  • [34] Novel two-dimensional off-policy Q-learning method for output feedback optimal tracking control of batch process with unknown dynamics
    Shi, Huiyuan
    Yang, Chen
    Jiang, Xueying
    Su, Chengli
    Li, Ping
    JOURNAL OF PROCESS CONTROL, 2022, 113 : 29 - 41
  • [35] Discrete-Time Optimal Control Scheme Based on Q-Learning Algorithm
    Wei, Qinglai
    Liu, Derong
    Song, Ruizhuo
    2016 SEVENTH INTERNATIONAL CONFERENCE ON INTELLIGENT CONTROL AND INFORMATION PROCESSING (ICICIP), 2016, : 125 - 130
  • [36] Data-driven disturbance compensation control for discrete-time systems based on reinforcement learning
    Li, Lanyue
    Li, Jinna
    Cao, Jiangtao
    INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2024,
  • [37] Adaptive optimal output feedback tracking control for unknown discrete-time linear systems using a combined reinforcement Q-learning and internal model method
    Sun, Weijie
    Zhao, Guangyue
    Peng, Yunjian
    IET CONTROL THEORY AND APPLICATIONS, 2019, 13 (18) : 3075 - 3086
  • [38] Optimal tracking control for discrete-time modal persistent dwell time switched systems based on Q-learning
    Zhang, Xuewen
    Wang, Yun
    Xia, Jianwei
    Li, Feng
    Shen, Hao
    OPTIMAL CONTROL APPLICATIONS & METHODS, 2023, 44 (06) : 3327 - 3341
  • [39] Reinforcement Q-Learning Algorithm for H∞ Tracking Control of Unknown Discrete-Time Linear Systems
    Peng, Yunjian
    Chen, Qian
    Sun, Weijie
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, 50 (11): : 4109 - 4122
  • [40] Influence Function Based Off-policy Q-learning Control for Markov Jump Systems
    Yuling Zou
    Jiwei Wen
    Huiwen Xue
    Xiaoli Luan
    International Journal of Control, Automation and Systems, 2025, 23 (5) : 1411 - 1420