Influence Function Based Off-policy Q-learning Control for Markov Jump Systems

Cited by: 0
Authors
Yuling Zou [1]
Jiwei Wen [1]
Huiwen Xue [1]
Xiaoli Luan [1]
Affiliations
[1] Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), School of Internet of Things Engineering, Jiangnan University
Keywords
control; influence function; Markov jump systems; off-policy Q-learning
DOI
10.1007/s12555-024-0579-8
Abstract
This paper presents an off-policy Q-learning approach based on the influence function for the H∞ control of Markov jump systems. Unlike the existing literature, a mode classification and parallel update method is developed to directly decouple the matrices of different modes, tackling the most challenging aspect of the problem. We then use an off-policy algorithm to derive the optimal policy, which allows efficient learning without following the policy currently being improved. This is particularly advantageous because the algorithm can explore and evaluate different policies from historical data, circumventing the limitations associated with specific forms of disturbance updates. Moreover, the influence function is employed for data cleansing during learning, shortening the learning period. A numerical example and a DC motor model illustrate the validity of the proposed method.
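To make the ingredients above concrete, here is a minimal sketch, assuming a two-mode jump linear system x_{k+1} = A(m_k)x_k + B(m_k)u_k with a quadratic stage cost and a fully observed mode. It illustrates the three steps the abstract names (grouping samples by mode so the per-mode regressions decouple, learning off-policy from purely exploratory data, and influence-function screening of samples); it is not the authors' implementation, it omits the H∞ disturbance channel, and all dynamics, weights, and helper names (quad_features, influence_keep, greedy) are assumptions.

```python
# Sketch only: illustrative dynamics and cost, not the paper's benchmark.
import numpy as np

rng = np.random.default_rng(0)
A = [np.array([[0.9, 0.2], [0.0, 0.8]]),          # mode-0 dynamics (assumed)
     np.array([[0.7, -0.3], [0.1, 0.9]])]         # mode-1 dynamics (assumed)
B = [np.array([[0.0], [1.0]]),
     np.array([[0.5], [1.0]])]
P = np.array([[0.9, 0.1], [0.2, 0.8]])            # mode transition probabilities
Qc, Rc, gamma = np.eye(2), np.eye(1), 0.95        # stage cost x'Qc x + u'Rc u

def quad_features(z):
    """Basis for Q(z) = z' H z: terms z_i z_j, i <= j (off-diagonals doubled)."""
    return np.array([(1.0 if i == j else 2.0) * z[i] * z[j]
                     for i in range(len(z)) for j in range(i, len(z))])

def unpack(theta, n=3):
    """Rebuild the symmetric Q-function kernel H from the feature weights."""
    H, idx = np.zeros((n, n)), 0
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = theta[idx]
            idx += 1
    return H

def greedy(H, x):
    """Greedy (target-policy) action u = -H_uu^{-1} H_ux x, for scalar u."""
    return -np.linalg.solve(H[2:, 2:] + 1e-8 * np.eye(1), H[2:, :2] @ x)

def influence_keep(Phi, y, theta, drop_frac=0.05):
    """Approximate leave-one-out influence (|residual| x leverage) of each
    sample on the least-squares fit; dropping the most influential samples
    plays the 'data cleansing' role of the influence function here."""
    G = np.linalg.pinv(Phi.T @ Phi)
    lev = np.einsum('ij,jk,ik->i', Phi, G, Phi)   # diagonal of the hat matrix
    infl = np.abs(y - Phi @ theta) * lev / np.maximum(1.0 - lev, 1e-8)
    return infl <= np.quantile(infl, 1.0 - drop_frac)

# Off-policy data: the behavior policy is pure exploration, independent of the
# greedy target policy that is being improved.
data, x, m = [], rng.standard_normal(2), 0
for _ in range(4000):
    u = rng.standard_normal(1)
    c = x @ Qc @ x + u @ Rc @ u
    m_next = rng.choice(2, p=P[m])
    x_next = A[m] @ x + B[m] @ u
    data.append((m, x, u, c, m_next, x_next))
    x, m = x_next, m_next

# Mode classification: group samples by mode so each mode's regression is an
# independent least-squares problem (the 'parallel update').
batches = {i: [d for d in data if d[0] == i] for i in (0, 1)}
Phis = {i: np.array([quad_features(np.concatenate([x, u]))
                     for _, x, u, _, _, _ in batches[i]]) for i in (0, 1)}

theta = [np.zeros(6), np.zeros(6)]
for _ in range(60):                               # Q-learning iterations
    H = [unpack(t) for t in theta]
    for i in (0, 1):                              # the two solves are decoupled
        y = np.array([c + gamma * quad_features(
                          np.concatenate([xn, greedy(H[mn], xn)])) @ theta[mn]
                      for _, _, _, c, mn, xn in batches[i]])
        t0 = np.linalg.lstsq(Phis[i], y, rcond=None)[0]
        keep = influence_keep(Phis[i], y, t0)     # cleanse, then refit
        theta[i] = np.linalg.lstsq(Phis[i][keep], y[keep], rcond=None)[0]

for i in (0, 1):
    Hi = unpack(theta[i])
    print(f"mode {i} greedy gain:", np.linalg.solve(Hi[2:, 2:], Hi[2:, :2]))
```

In the paper's H∞ setting the Bellman target comes from a zero-sum game against a worst-case disturbance rather than the pure minimization sketched here, but the mode-wise decoupling and the influence-based cleansing would enter at the same two points: how the samples are grouped and which samples survive each regression.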
Pages
1411-1420 (9 pages)
Related Papers
50 items in total (first 10 shown)
  • [1] Off-policy Q-learning: Optimal tracking control for networked control systems
    Li, J.-N.
    Yin, Z.-X.
    Kongzhi yu Juece/Control and Decision, 2019, 34 (11): 2343-2349
  • [2] Zero-sum game-based optimal control for discrete-time Markov jump systems: A parallel off-policy Q-learning method
    Wang, Yun
    Fang, Tian
    Kong, Qingkai
    Li, Feng
    Applied Mathematics and Computation, 2024, 467
  • [3] Robust optimal tracking control for multiplayer systems by off-policy Q-learning approach
    Li, Jinna
    Xiao, Zhenfei
    Li, Ping
    Cao, Jiangtao
    International Journal of Robust and Nonlinear Control, 2021, 31 (1): 87-106
  • [4] Off-Policy Q-Learning for Anti-Interference Control of Multi-Player Systems
    Li, Jinna
    Xiao, Zhenfei
    Chai, Tianyou
    Lewis, Frank L.
    Jagannathan, Sarangapani
    IFAC-PapersOnLine, 2020, 53 (2): 9189-9194
  • [5] H∞ Control for Discrete-Time Multi-Player Systems via Off-Policy Q-Learning
    Li, Jinna
    Xiao, Zhenfei
    IEEE Access, 2020, 8: 28831-28846
  • [6] Data-Driven H∞ Optimal Output Feedback Control for Linear Discrete-Time Systems Based on Off-Policy Q-Learning
    Zhang, Li
    Fan, Jialu
    Xue, Wenqian
    Lopez, Victor G.
    Li, Jinna
    Chai, Tianyou
    Lewis, Frank L.
    IEEE Transactions on Neural Networks and Learning Systems, 2023, 34 (7): 3553-3567
  • [7] Optimal Control for Interconnected Multi-Area Power Systems With Unknown Dynamics: An Off-Policy Q-Learning Method
    Wang, Jing
    Mi, Xuanrui
    Shen, Hao
    Park, Ju H.
    Shi, Kaibo
    IEEE Transactions on Circuits and Systems II: Express Briefs, 2024, 71 (5): 2849-2853
  • [8] Discrete-Time Multi-Player Games Based on Off-Policy Q-Learning
    Li, Jinna
    Xiao, Zhenfei
    Li, Ping
    IEEE Access, 2019, 7: 134647-134659
  • [9] Off-policy Q-learning-based Tracking Control for Stochastic Linear Discrete-Time Systems
    Liu, Xuantong
    Zhang, Lei
    Peng, Yunjian
    2022 4th International Conference on Control and Robotics (ICCR), 2022: 252-256
  • [10] Reinforcement Q-learning algorithm for H∞ tracking control of discrete-time Markov jump systems
    Shi, Jiahui
    He, Dakuo
    Zhang, Qiang
    International Journal of Systems Science, 2025, 56 (3): 502-523