Influence Function Based Off-policy Q-learning Control for Markov Jump Systems

Cited by: 0
Authors
Yuling Zou [1]
Jiwei Wen [1]
Huiwen Xue [1]
Xiaoli Luan [1]
Affiliations
[1] Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), School of Internet of Things Engineering, Jiangnan University
Keywords
control; influence function; Markov jump systems; off-policy Q-learning
DOI
10.1007/s12555-024-0579-8
Abstract
This paper presents an off-policy Q-learning approach based on the influence function for the H∞ control of Markov jump systems. Unlike the existing literature, a mode classification and parallel update method is developed to directly decouple the matrices of different modes, tackling the most challenging aspect of the problem. We then use an off-policy algorithm to derive the optimal policy, which allows efficient learning without following the policy currently being improved. This is particularly advantageous because the algorithm can explore and evaluate different policies from historical data, circumventing the limitations associated with specific forms of disturbance updates. Moreover, the influence function is employed for data cleansing during learning, shortening the learning period. A numerical example and a DC motor model illustrate the validity of the proposed method.
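To make the ingredients above concrete, here is a minimal sketch, assuming a two-mode jump linear system x_{k+1} = A(m_k)x_k + B(m_k)u_k with a quadratic stage cost and a fully observed mode. It illustrates the three steps the abstract names (grouping samples by mode so the per-mode regressions decouple, learning off-policy from purely exploratory data, and influence-function screening of samples); it is not the authors' implementation, it omits the H∞ disturbance channel, and all dynamics, weights, and helper names (quad_features, influence_keep, greedy) are assumptions.

```python
# Sketch only: illustrative dynamics and cost, not the paper's benchmark.
import numpy as np

rng = np.random.default_rng(0)
A = [np.array([[0.9, 0.2], [0.0, 0.8]]),          # mode-0 dynamics (assumed)
     np.array([[0.7, -0.3], [0.1, 0.9]])]         # mode-1 dynamics (assumed)
B = [np.array([[0.0], [1.0]]),
     np.array([[0.5], [1.0]])]
P = np.array([[0.9, 0.1], [0.2, 0.8]])            # mode transition probabilities
Qc, Rc, gamma = np.eye(2), np.eye(1), 0.95        # stage cost x'Qc x + u'Rc u

def quad_features(z):
    """Basis for Q(z) = z' H z: terms z_i z_j, i <= j (off-diagonals doubled)."""
    return np.array([(1.0 if i == j else 2.0) * z[i] * z[j]
                     for i in range(len(z)) for j in range(i, len(z))])

def unpack(theta, n=3):
    """Rebuild the symmetric Q-function kernel H from the feature weights."""
    H, idx = np.zeros((n, n)), 0
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = theta[idx]
            idx += 1
    return H

def greedy(H, x):
    """Greedy (target-policy) action u = -H_uu^{-1} H_ux x, for scalar u."""
    return -np.linalg.solve(H[2:, 2:] + 1e-8 * np.eye(1), H[2:, :2] @ x)

def influence_keep(Phi, y, theta, drop_frac=0.05):
    """Approximate leave-one-out influence (|residual| x leverage) of each
    sample on the least-squares fit; dropping the most influential samples
    plays the 'data cleansing' role of the influence function here."""
    G = np.linalg.pinv(Phi.T @ Phi)
    lev = np.einsum('ij,jk,ik->i', Phi, G, Phi)   # diagonal of the hat matrix
    infl = np.abs(y - Phi @ theta) * lev / np.maximum(1.0 - lev, 1e-8)
    return infl <= np.quantile(infl, 1.0 - drop_frac)

# Off-policy data: the behavior policy is pure exploration, independent of the
# greedy target policy that is being improved.
data, x, m = [], rng.standard_normal(2), 0
for _ in range(4000):
    u = rng.standard_normal(1)
    c = x @ Qc @ x + u @ Rc @ u
    m_next = rng.choice(2, p=P[m])
    x_next = A[m] @ x + B[m] @ u
    data.append((m, x, u, c, m_next, x_next))
    x, m = x_next, m_next

# Mode classification: group samples by mode so each mode's regression is an
# independent least-squares problem (the 'parallel update').
batches = {i: [d for d in data if d[0] == i] for i in (0, 1)}
Phis = {i: np.array([quad_features(np.concatenate([x, u]))
                     for _, x, u, _, _, _ in batches[i]]) for i in (0, 1)}

theta = [np.zeros(6), np.zeros(6)]
for _ in range(60):                               # Q-learning iterations
    H = [unpack(t) for t in theta]
    for i in (0, 1):                              # the two solves are decoupled
        y = np.array([c + gamma * quad_features(
                          np.concatenate([xn, greedy(H[mn], xn)])) @ theta[mn]
                      for _, _, _, c, mn, xn in batches[i]])
        t0 = np.linalg.lstsq(Phis[i], y, rcond=None)[0]
        keep = influence_keep(Phis[i], y, t0)     # cleanse, then refit
        theta[i] = np.linalg.lstsq(Phis[i][keep], y[keep], rcond=None)[0]

for i in (0, 1):
    Hi = unpack(theta[i])
    print(f"mode {i} greedy gain:", np.linalg.solve(Hi[2:, 2:], Hi[2:, :2]))
```

In the paper's H∞ setting the Bellman target comes from a zero-sum game against a worst-case disturbance rather than the pure minimization sketched here, but the mode-wise decoupling and the influence-based cleansing would enter at the same two points: how the samples are grouped and which samples survive each regression.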
Pages
1411-1420 (9 pages)
Related Papers
50 items in total (first 10 shown)
  • [1] Off-policy Q-learning: Optimal tracking control for networked control systems
    Li, J.-N.
    Yin, Z.-X.
    Kongzhi yu Juece/Control and Decision, 2019, 34 (11): 2343-2349
  • [2] Zero-sum game-based optimal control for discrete-time Markov jump systems: A parallel off-policy Q-learning method
    Wang, Yun
    Fang, Tian
    Kong, Qingkai
    Li, Feng
    Applied Mathematics and Computation, 2024, 467
  • [3] Robust optimal tracking control for multiplayer systems by off-policy Q-learning approach
    Li, Jinna
    Xiao, Zhenfei
    Li, Ping
    Cao, Jiangtao
    International Journal of Robust and Nonlinear Control, 2021, 31 (1): 87-106
  • [4] Off-Policy Q-Learning for Anti-Interference Control of Multi-Player Systems
    Li, Jinna
    Xiao, Zhenfei
    Chai, Tianyou
    Lewis, Frank L.
    Jagannathan, Sarangapani
    IFAC-PapersOnLine, 2020, 53 (2): 9189-9194
  • [5] H∞ Control for Discrete-Time Multi-Player Systems via Off-Policy Q-Learning
    Li, Jinna
    Xiao, Zhenfei
    IEEE Access, 2020, 8: 28831-28846
  • [6] Data-Driven H∞ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning
    Zhang, Li
    Fan, Jialu
    Xue, Wenqian
    Lopez, Victor G.
    Li, Jinna
    Chai, Tianyou
    Lewis, Frank L.
    IEEE Transactions on Neural Networks and Learning Systems, 2023, 34 (7): 3553-3567
  • [7] Optimal Control for Interconnected Multi-Area Power Systems With Unknown Dynamics: An Off-Policy Q-Learning Method
    Wang, Jing
    Mi, Xuanrui
    Shen, Hao
    Park, Ju H.
    Shi, Kaibo
    IEEE Transactions on Circuits and Systems II: Express Briefs, 2024, 71 (5): 2849-2853
  • [8] Discrete-Time Multi-Player Games Based on Off-Policy Q-Learning
    Li, Jinna
    Xiao, Zhenfei
    Li, Ping
    IEEE Access, 2019, 7: 134647-134659
  • [9] Off-policy Q-learning-based Tracking Control for Stochastic Linear Discrete-Time Systems
    Liu, Xuantong
    Zhang, Lei
    Peng, Yunjian
    2022 4th International Conference on Control and Robotics (ICCR), 2022: 252-256
  • [10] Reinforcement Q-learning algorithm for H∞ tracking control of discrete-time Markov jump systems
    Shi, Jiahui
    He, Dakuo
    Zhang, Qiang
    International Journal of Systems Science, 2025, 56 (3): 502-523