Model-free extended Q-learning method for H∞ output tracking control of networked control systems with network delays and packet loss

Times cited: 0
Authors
Hao, Longyan [1 ]
Wang, Chaoli [1 ]
Liang, Dong [1 ]
Li, Shihua [2 ]
Affiliations
[1] Univ Shanghai Sci & Technol, Dept Control Sci & Engn, Shanghai 200093, Peoples R China
[2] Southeast Univ, Sch Automat, Nanjing 211189, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Q-learning algorithm; Networked control systems; H∞ output tracking control; ZERO-SUM GAMES; DISCRETE-TIME SYSTEMS; ADAPTIVE OPTIMAL CONTROL; LINEAR SYSTEMS; FEEDBACK; DESIGN;
DOI
10.1016/j.neucom.2025.129846
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
In this paper, an extended Q-learning method is used to study the H∞ output tracking control (HOTC) problem for networked control systems with state delays and data loss. In contrast to existing results, the networked control system considered here is subject to network delays and packet loss as well as external disturbances. To handle the disturbances, the H∞ control problem is reformulated as a minimax optimization problem and solved by the method of zero-sum games. Because packet loss and state delays make accurate current state information unavailable, a new Smith predictor that accounts for both delay and packet loss is designed to predict the current state. Using the predicted state, the extended Q-learning algorithm is implemented to solve the H∞ output tracking problem when the system dynamics are unknown. The convergence of the extended Q-learning algorithm is then proved, and the stability and optimality of the proposed method are established in theorems. Finally, numerical simulations verify the effectiveness of the proposed algorithm.
Pages: 11