Model-free extended Q-learning method for H∞ output tracking control of networked control systems with network delays and packet loss

Cited by: 0
Authors
Hao, Longyan [1 ]
Wang, Chaoli [1 ]
Liang, Dong [1 ]
Li, Shihua [2 ]
Affiliations
[1] Univ Shanghai Sci & Technol, Dept Control Sci & Engn, Shanghai 200093, Peoples R China
[2] Southeast Univ, Sch Automat, Nanjing 211189, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Q-learning algorithm; Networked control systems; H∞ output tracking control; ZERO-SUM GAMES; DISCRETE-TIME-SYSTEMS; ADAPTIVE OPTIMAL-CONTROL; LINEAR-SYSTEMS; FEEDBACK; DESIGN;
DOI
10.1016/j.neucom.2025.129846
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, an extended Q-learning method is used to study the H∞ output tracking control (HOTC) problem of networked control systems with state delays and packet loss. In contrast to existing results, the networked control system considered here is subject to network delays, packet loss, and external disturbances simultaneously. To handle the disturbances, the H∞ control problem is transformed into a min-max optimization problem, which is solved within a zero-sum game framework. Because packet loss and state delays make accurate current state information unavailable, a new Smith predictor that accounts for both delay and packet loss is designed to predict the current state. Using the predicted state, the extended Q-learning algorithm solves the H∞ output tracking problem without knowledge of the system dynamics. The convergence of the extended Q-learning algorithm is then proved, and the stability and optimality of the proposed method are analyzed in the theorems. Finally, a numerical simulation verifies the effectiveness of the proposed algorithm.
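For orientation, the block below sketches a standard discrete-time zero-sum game formulation of H∞ tracking of the kind the abstract describes, together with the associated game Q-function that a model-free Q-learning scheme would estimate from data. It is illustrative only: the augmented dynamics, weighting matrices (Q_e, R), attenuation level β, discount factor γ, and the paper's exact Q-function parameterization are assumptions and may differ from those used by the authors.

% Generic zero-sum game view of discrete-time H-infinity output tracking
% (illustrative sketch; weights, augmentation, and notation are assumed).
\begin{align}
  & x_{k+1} = A x_k + B u_k + D w_k, \qquad e_k = y_k - r_k, \\
  % Control u minimizes and disturbance w maximizes the discounted cost
  & V(x_k) = \min_{u}\,\max_{w}\; \sum_{i=k}^{\infty} \gamma^{\,i-k}
      \left( e_i^{\top} Q_e\, e_i + u_i^{\top} R\, u_i - \beta^{2} w_i^{\top} w_i \right), \\
  % Game Bellman (Q-function) equation used by model-free Q-learning
  & Q(x_k, u_k, w_k) = e_k^{\top} Q_e\, e_k + u_k^{\top} R\, u_k - \beta^{2} w_k^{\top} w_k
      + \gamma\, V(x_{k+1}).
\end{align}

Under this formulation, the saddle-point policies obtained from the learned Q-function yield a disturbance attenuation level β without requiring the system matrices, which is the role the extended Q-learning algorithm plays here once the Smith predictor supplies a usable current-state estimate.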
Pages: 11
References (39 in total)
[1]  
Zames G., Feedback and optimal sensitivity: Model reference transformations, multiplicative seminorms, and approximate inverses, IEEE Trans. Autom. Control, 26, 2, pp. 301-320, (1981)
[2]  
Al-Tamimi A., Lewis F.L., Abu-Khalaf M., Model-free Q-learning designs for linear discrete-time zero-sum games with application to H∞ control, Automatica, 43, 3, pp. 473-481, (2007)
[3]  
Bian T., Jiang Z.-P., Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design, Automatica, 71, pp. 348-360, (2016)
[4]  
Rizvi S.A.A., Lin Z., Output feedback Q-learning control for the discrete-time linear quadratic regulator problem, IEEE Trans. Neural Netw. Learn. Syst., 30, 5, pp. 1523-1536, (2018)
[5]  
Jiang Y., Jiang Z.-P., Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics, Automatica, 48, 10, pp. 2699-2704, (2012)
[6]  
Vrabie D., Pastravanu O., Abu-Khalaf M., Lewis F., Adaptive optimal control for continuous-time linear systems based on policy iteration, Automatica, 45, 2, pp. 477-484, (2009)
[7]  
Modares H., Lewis F.L., Jiang Z.-P., Optimal output-feedback control of unknown continuous-time linear systems using off-policy reinforcement learning, IEEE Trans. Cybern., 46, 11, pp. 2401-2410, (2016)
[8]  
Rizvi S.A.A., Lin Z., Output feedback Q-learning control for the discrete-time linear quadratic regulator problem, IEEE Trans. Neural Netw. Learn. Syst., 30, 5, pp. 1523-1536, (2019)
[9]  
Jiang Y., Kiumarsi B., Fan J., Chai T., Li J., Lewis F.L., Optimal output regulation of linear discrete-time systems with unknown dynamics using reinforcement learning, IEEE Trans. Cybern., pp. 1-10, (2019)
[10]  
Chen C., Modares H., Xie K., Lewis F.L., Wan Y., Xie S., Reinforcement learning-based adaptive optimal exponential tracking control of linear systems with unknown dynamics, IEEE Trans. Autom. Control, 64, 11, pp. 4423-4438, (2019)