Finite-horizon Q-learning for discrete-time zero-sum games with application to H∞ control

Cited by: 1

Authors:

Liu, Mingxiang [1,2]

Cai, Qianqian [1,2,4]

Meng, Wei [1,2]

Li, Dandan [1,2]

Fu, Minyue [3]

Affiliations:

[1] Guangdong Univ Technol, Sch Automat, Guangzhou, Peoples R China

[2] Guangdong Prov Key Lab Intelligent Decis & Coopera, Guangzhou, Peoples R China

[3] Southern Univ Sci & Technol, Dept Mech & Energy Engn, Shenzhen, Peoples R China

[4] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Guangdong, Peoples R China

Source:

ASIAN JOURNAL OF CONTROL | 2023 / Vol. 25 / No. 04

Funding:

National Natural Science Foundation of China;

Keywords:

finite-horizon; H∞ control; linear quadratic (LQ) control; Q-learning; zero-sum games; SYSTEMS;

DOI:

10.1002/asjc.3027

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

In this paper, we investigate optimal control strategies for model-free zero-sum games involving H∞ control. The key contribution is a Q-learning algorithm for linear quadratic games that does not require knowledge of the system dynamics. The finite-horizon setting is more practical than the infinite-horizon setting, but the time-varying Riccati equation associated with it is difficult to solve directly. The proposed algorithm is shown to solve the time-varying Riccati equation iteratively without the use of a model, and numerical experiments on aircraft dynamics demonstrate the algorithm's efficiency.


Pages: 3160-3168

Number of pages: 9

References: 34 in total

[1] Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control
Al-Tamimi, Asma
Lewis, Frank L.
Abu-Khalaf, Murad
[J]. AUTOMATICA, 2007, 43 (03) : 473 - 481
[2] Anderson B. D. O., 1979, OPTIMAL FILTERING
[3] Anderson B. D. O., 2007, Optimal control: linear quadratic methods
[4] Basar, 1999, DYNAMIC NONCOOPERATI
[5] A DYNAMIC-GAMES APPROACH TO CONTROLLER-DESIGN - DISTURBANCE REJECTION IN DISCRETE-TIME
BASAR, T
[J]. PROCEEDINGS OF THE 28TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-3, 1989, : 407 - 414
[6] Basar T., 2008, H-infinity optimal control and related minimax design problems: a dynamic game approach
[7] Bertsekas DP, 1996, NEURO DYNAMIC PROGRA
[8] BRADTKE SJ, 1994, PROCEEDINGS OF THE 1994 AMERICAN CONTROL CONFERENCE, VOLS 1-3, P3475
[9] Output Feedback Q-Learning for Linear-Quadratic Discrete-Time Finite-Horizon Control Problems
Calafiore, Giuseppe C.
Possieri, Corrado
[J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (07) : 3274 - 3281
[10] Nonfragile H∞ Filter Design for T-S Fuzzy Systems in Standard Form
Chang, Xiao-Heng
Yang, Guang-Hong
[J]. IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2014, 61 (07) : 3448 - 3458
