Finite-horizon Q-learning for discrete-time zero-sum games with application to H∞ control

Cited by: 1
Authors
Liu, Mingxiang [1, 2]
Cai, Qianqian [1, 2, 4]
Meng, Wei [1, 2]
Li, Dandan [1, 2]
Fu, Minyue [3]
Affiliations
[1] Guangdong Univ Technol, Sch Automat, Guangzhou, Peoples R China
[2] Guangdong Prov Key Lab Intelligent Decis & Coopera, Guangzhou, Peoples R China
[3] Southern Univ Sci & Technol, Dept Mech & Energy Engn, Shenzhen, Peoples R China
[4] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
finite-horizon; H∞ control; linear quadratic (LQ) control; Q-learning; zero-sum games; systems
DOI
10.1002/asjc.3027
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we investigate optimal control strategies for model-free zero-sum games, with application to H∞ control. The key contribution is a Q-learning algorithm for linear quadratic games that does not require knowledge of the system dynamics. The finite-horizon setting is more practical than the infinite-horizon setting, but the associated time-varying Riccati equation is difficult to solve directly. The proposed algorithm is shown to solve this time-varying Riccati equation iteratively without a model, and numerical experiments on aircraft dynamics demonstrate the algorithm's efficiency.
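For orientation, the time-varying Riccati equation mentioned in the abstract can be illustrated with a minimal model-based sketch. This is not the paper's model-free algorithm. It assumes a scalar system x[k+1] = a*x[k] + b*u[k] + e*w[k] with stage cost q*x^2 + r*u^2 - gamma^2*w^2 and terminal weight p_terminal; all names (`a`, `b`, `e`, `q`, `r`, `gamma`, `game_riccati_backward`) are hypothetical and chosen for illustration only. A Q-learning method would estimate the `h_*` kernel entries from measured data rather than compute them from a known model.

```python
def game_riccati_backward(a, b, e, q, r, gamma, p_terminal, horizon):
    """Backward zero-sum game Riccati recursion for an assumed scalar system.

    Returns (p_seq, gains): p_seq[k] is the value weight at step k
    (k = 0..horizon), and gains[k] = (k_u, k_w) gives the saddle-point
    policies u[k] = -k_u * x[k] and w[k] = -k_w * x[k].
    """
    p = p_terminal
    p_seq = [p]
    gains = []
    for _ in range(horizon):
        # Entries of the quadratic Q-function kernel built from p at step k+1.
        h_xx = q + a * p * a
        h_xu = a * p * b
        h_xw = a * p * e
        h_uu = r + b * p * b
        h_uw = b * p * e
        h_ww = e * p * e - gamma ** 2
        # A saddle point needs convexity in u and concavity in w.
        if h_uu <= 0 or h_ww >= 0:
            raise ValueError("no saddle point: try a larger gamma")
        # Solve the 2x2 stationarity system for both players' gains.
        det = h_uu * h_ww - h_uw ** 2
        k_u = (h_ww * h_xu - h_uw * h_xw) / det
        k_w = (h_uu * h_xw - h_uw * h_xu) / det
        # Value weight one step earlier.
        p = h_xx - (h_xu * k_u + h_xw * k_w)
        p_seq.append(p)
        gains.append((k_u, k_w))
    # Computed backward in time, so reverse to index by step k.
    p_seq.reverse()
    gains.reverse()
    return p_seq, gains


if __name__ == "__main__":
    p_seq, gains = game_riccati_backward(
        a=0.9, b=1.0, e=0.5, q=1.0, r=1.0, gamma=2.0,
        p_terminal=1.0, horizon=5,
    )
    for k, (k_u, k_w) in enumerate(gains):
        print(k, p_seq[k], k_u, k_w)
```

With `e = 0` the disturbance drops out and the recursion reduces to the standard finite-horizon LQ Riccati recursion, which gives a quick sanity check on the sketch.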
Pages: 3160-3168
Page count: 9