H∞ Tracking learning control for discrete-time Markov jump systems: A parallel off-policy reinforcement learning

Cited by: 1
|
Authors
Zhang, Xuewen [1 ]
Xia, Jianwei [2 ]
Wang, Jing [1 ]
Chen, Xiangyong [3 ]
Shen, Hao [1 ]
Affiliations
[1] Anhui Univ Technol, Sch Elect & Informat Engn, China Int Sci & Technol Cooperat Base Intelligent, Maanshan 243002, Peoples R China
[2] Liaocheng Univ, Sch Math Sci, Liaocheng 252059, Peoples R China
[3] Linyi Univ, Sch Automat & Elect Engn, Linyi 276005, Peoples R China
Source
JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS | 2023, Vol. 360, No. 18
Funding
National Natural Science Foundation of China;
关键词
FEEDBACK-CONTROL; LINEAR-SYSTEMS; DESIGN;
DOI
10.1016/j.jfranklin.2023.10.008
Chinese Library Classification
TP [Automation Technology; Computer Technology];
Subject Classification Code
0812;
Abstract
This paper deals with the H∞ tracking control problem for a class of linear discrete-time Markov jump systems, in which knowledge of the system dynamics is not required. First, combined with reinforcement learning, a novel Bellman equation and the augmented coupled game algebraic Riccati equation are presented to derive the optimal control policy for the augmented discrete-time Markov jump system. Moreover, based on the augmented system, a newly constructed system is given to collect the input and output data, which overcomes the difficulty of solving the coupling term in discrete-time Markov jump systems. Subsequently, a novel model-free algorithm is designed that does not require the dynamic information of the original system. Finally, a numerical example is given to verify the effectiveness of the proposed approach.
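For orientation, the coupling term referred to in the abstract is E_i(P) = Σ_j π_{ij} P_j, the transition-probability-weighted combination of the per-mode value matrices that makes the mode-wise Riccati equations interdependent. The sketch below is not the paper's parallel off-policy algorithm; it is a minimal model-based value iteration on the coupled game algebraic Riccati equations that such a model-free method ultimately targets. All system matrices, weights, and the attenuation level are hypothetical, chosen purely for illustration.

```python
import numpy as np

# Hypothetical two-mode Markov jump linear system:
#   x_{k+1} = A[i] x_k + B[i] u_k + D[i] w_k,  mode i governed by chain Pi
A = [np.array([[0.9, 0.2], [0.0, 0.8]]),
     np.array([[0.7, 0.1], [0.3, 0.6]])]
B = [np.array([[1.0], [0.5]]), np.array([[0.4], [1.0]])]
D = [np.array([[0.1], [0.0]]), np.array([[0.0], [0.1]])]
Pi = np.array([[0.7, 0.3],                 # mode transition probabilities
               [0.4, 0.6]])
Q, R = np.eye(2), np.eye(1)                # state / control weights
gamma = 5.0                                # illustrative attenuation level
N, m, q = len(A), 1, 1                     # modes, control dim, noise dim

P = [np.zeros((2, 2)) for _ in range(N)]   # one value matrix per mode
for _ in range(1000):
    P_new = []
    for i in range(N):
        # Coupling term E_i(P) = sum_j Pi[i, j] * P_j -- the quantity the
        # model-free scheme must handle without knowing Pi explicitly.
        E = sum(Pi[i, j] * P[j] for j in range(N))
        Bi = np.hstack([B[i], D[i]])       # stacked [control, disturbance]
        G = Bi.T @ E @ Bi                  # zero-sum game kernel block
        G[:m, :m] += R                     # u is the minimizing player
        G[m:, m:] -= gamma**2 * np.eye(q)  # w is the maximizing player
        AEB = A[i].T @ E @ Bi
        # One value-iteration sweep of the coupled GARE for mode i
        P_new.append(Q + A[i].T @ E @ A[i] - AEB @ np.linalg.inv(G) @ AEB.T)
    diff = max(abs(Pn - Po).max() for Pn, Po in zip(P_new, P))
    P = P_new
    if diff < 1e-10:
        break

# Per-mode controller gain K_i and worst-case disturbance gain L_i:
#   u_k = -K_i x_k,  w_k = -L_i x_k  at the game's saddle point
for i in range(N):
    E = sum(Pi[i, j] * P[j] for j in range(N))
    Bi = np.hstack([B[i], D[i]])
    G = Bi.T @ E @ Bi
    G[:m, :m] += R
    G[m:, m:] -= gamma**2 * np.eye(q)
    KL = np.linalg.solve(G, Bi.T @ E @ A[i])
    print(f"mode {i}: K = {KL[:m]}, L = {KL[m:]}")
```

The paper's contribution is to recover the same per-mode gains from measured input and output data of a constructed augmented system, in parallel across modes, without access to the matrices A_i, B_i, D_i used above.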
Pages: 14878 - 14890
Number of pages: 13
Related Papers
(50 records in total)
  • [1] Reinforcement Q-learning algorithm for H∞ tracking control of discrete-time Markov jump systems
    Shi, Jiahui
    He, Dakuo
    Zhang, Qiang
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2025, 56 (03) : 502 - 523
  • [2] Fuzzy-Based Adaptive Optimization of Unknown Discrete-Time Nonlinear Markov Jump Systems With Off-Policy Reinforcement Learning
    Fang, Haiyang
    Tu, Yidong
    Wang, Hai
    He, Shuping
    Liu, Fei
    Ding, Zhengtao
    Cheng, Shing Shin
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2022, 30 (12) : 5276 - 5290
  • [3] Non-zero-sum games of discrete-time Markov jump systems with unknown dynamics: An off-policy reinforcement learning method
    Zhang, Xuewen
    Shen, Hao
    Li, Feng
    Wang, Jing
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2024, 34 (02) : 949 - 968
  • [4] Zero-sum game-based optimal control for discrete-time Markov jump systems: A parallel off-policy Q-learning method
    Wang, Yun
    Fang, Tian
    Kong, Qingkai
    Li, Feng
    APPLIED MATHEMATICS AND COMPUTATION, 2024, 467
  • [5] H∞ Tracking Control of Completely Unknown Continuous-Time Systems via Off-Policy Reinforcement Learning
    Modares, Hamidreza
    Lewis, Frank L.
    Jiang, Zhong-Ping
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (10) : 2550 - 2562
  • [6] Off-Policy Interleaved Q-Learning: Optimal Control for Affine Nonlinear Discrete-Time Systems
    Li, Jinna
    Chai, Tianyou
    Lewis, Frank L.
    Ding, Zhengtao
    Jiang, Yi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (05) : 1308 - 1320
  • [7] Data-Driven Robust Control of Discrete-Time Uncertain Linear Systems via Off-Policy Reinforcement Learning
    Yang, Yongliang
    Guo, Zhishan
    Xiong, Haoyi
    Ding, Da-Wei
    Yin, Yixin
    Wunsch, Donald C.
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (12) : 3735 - 3747
  • [8] H∞ Control for Discrete-Time Multi-Player Systems via Off-Policy Q-Learning
    Li, Jinna
    Xiao, Zhenfei
    IEEE ACCESS, 2020, 8 (08) : 28831 - 28846
  • [9] Off-policy integral reinforcement learning optimal tracking control for continuous-time chaotic systems
    Wei Qing-Lai
    Song Rui-Zhuo
    Sun Qiu-Ye
    Xiao Wen-Dong
    CHINESE PHYSICS B, 2015, 24 (09)
  • [10] Actor-Critic Off-Policy Learning for Optimal Control of Multiple-Model Discrete-Time Systems
    Skach, Jan
    Kiumarsi, Bahare
    Lewis, Frank L.
    Straka, Ondrej
    IEEE TRANSACTIONS ON CYBERNETICS, 2018, 48 (01) : 29 - 40