Q-learning based tracking control with novel finite-horizon performance index

Cited by: 0
Authors
Wang, Wei [1, 2, 3]
Wang, Ke [1]
Huang, Zixin [4]
Mu, Chaoxu [1]
Shi, Haoxian [5]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Zhongnan Univ Econ & Law, Sch Informat Engn, Wuhan 430073, Peoples R China
[3] Zhongnan Univ Econ & Law, Emergency Management Res Ctr, Wuhan 430073, Peoples R China
[4] Wuhan Inst Technol, Sch Elect & Informat Engn, Wuhan 430205, Peoples R China
[5] China Geol Survey, Guangzhou Marine Geol Survey, Guangzhou 510075, Peoples R China
Keywords
Optimal tracking control; Model-free control; Q-function; Finite-horizon; NONLINEAR-SYSTEMS; TIME-SYSTEMS
DOI
10.1016/j.ins.2024.121212
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
A data-driven method based on Q-learning is designed in this paper to realize model-free finite-horizon optimal tracking control (FHOTC) of unknown linear discrete-time systems. First, a novel finite-horizon performance index (FHPI) that depends only on the next-step tracking error is introduced. Then, an augmented system is formulated that incorporates the system model and the trajectory model. Based on the novel FHPI, the augmented time-varying Riccati equation (ATVRE) is derived. We present a data-driven FHOTC method that uses Q-learning to optimize the defined time-varying Q-function, which allows the solutions of the ATVRE to be estimated without knowledge of the system dynamics. Finally, the validity and features of the proposed Q-learning-based FHOTC method are demonstrated through comparative simulation studies.
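The idea in the abstract can be illustrated with a small numerical sketch. It is not the paper's algorithm: the abstract does not give the exact FHPI, so the code assumes a stage cost of the form e_{k+1}^T Q e_{k+1} + u_k^T R u_k, a zero terminal value, and a batch least-squares fit of a quadratic time-varying Q-function. The example matrices and the function names `riccati_gains` and `q_learning_gains` are hypothetical. Under these assumptions, the gains learned from sampled transitions should agree with those from a model-based backward Riccati recursion on the augmented system.

```python
import numpy as np

def riccati_gains(T, B1, Qbar, R, N):
    """Model-based backward recursion on the augmented system (reference result)."""
    P = np.zeros_like(Qbar)                  # assumed terminal condition P_N = 0
    gains = []
    for _ in range(N):                       # runs k = N-1, ..., 0
        M = Qbar + P                         # next-step error weight plus cost-to-go
        Huu = R + B1.T @ M @ B1
        Huz = B1.T @ M @ T
        K = np.linalg.solve(Huu, Huz)        # u_k = -K_k z_k
        P = T.T @ M @ T - Huz.T @ K
        gains.append(K)
    return gains[::-1]                       # reorder as K_0, ..., K_{N-1}

def q_learning_gains(step, Qbar, R, N, n, m, n_samples=200, seed=0):
    """Data-driven version: fit Q_k(z, u) = w^T H_k w, w = [z; u], by least squares.

    Only sampled transitions (z, u, z_next) are used; T and B1 are never read.
    """
    rng = np.random.default_rng(seed)
    iu = np.triu_indices(n + m)
    scale = np.where(iu[0] == iu[1], 1.0, 2.0)   # off-diagonal terms appear twice
    P = np.zeros((n, n))
    gains = []
    for _ in range(N):
        Phi = np.empty((n_samples, iu[0].size))
        y = np.empty(n_samples)
        for i in range(n_samples):
            z = rng.standard_normal(n)
            u = rng.standard_normal(m)
            z_next = step(z, u)                  # one observed transition
            w = np.concatenate([z, u])
            Phi[i] = scale * np.outer(w, w)[iu]  # quadratic features of w
            # target: next-step tracking cost + control cost + cost-to-go
            y[i] = z_next @ (Qbar + P) @ z_next + u @ R @ u
        theta = np.linalg.lstsq(Phi, y, rcond=None)[0]
        H = np.zeros((n + m, n + m))
        H[iu] = theta
        H = H + H.T - np.diag(np.diag(H))        # rebuild the symmetric H_k
        Huu, Huz = H[n:, n:], H[n:, :n]
        K = np.linalg.solve(Huu, Huz)
        P = H[:n, :n] - Huz.T @ K
        gains.append(K)
    return gains[::-1]

# Hypothetical example: 2-state plant tracking a constant reference.
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
F = np.array([[1.0]])                            # trajectory model r_{k+1} = F r_k
T = np.block([[A, np.zeros((2, 1))], [np.zeros((1, 2)), F]])
B1 = np.vstack([B, np.zeros((1, 1))])
C1 = np.hstack([C, -np.eye(1)])                  # e_k = C x_k - r_k = C1 z_k
Qbar = C1.T @ (10.0 * np.eye(1)) @ C1
R = np.array([[1.0]])
N = 10

K_model = riccati_gains(T, B1, Qbar, R, N)
# The lambda stands in for the unknown plant; the learner only sees its outputs.
K_data = q_learning_gains(lambda z, u: T @ z + B1 @ u, Qbar, R, N, n=3, m=1)
max_gap = max(np.abs(Km - Kd).max() for Km, Kd in zip(K_model, K_data))
print(max_gap)
```

Because the sampled data here are noise-free, the least-squares fit recovers each H_k essentially exactly, so `max_gap` is expected to be at the level of numerical round-off. With noisy measurements the fit would only approximate the Riccati solution.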
Pages: 10
Related papers
50 in total
[21]   Model-Free Optimal Tracking Control via Critic-Only Q-Learning [J].
Luo, Biao ;
Liu, Derong ;
Huang, Tingwen ;
Wang, Ding .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2016, 27 (10) :2134-2144
[22]   Adaptive periodic event-triggered control for missile-target interception system with finite-horizon convergence [J].
Duan, Dandan ;
Liu, Chunsheng ;
Sun, Jingliang .
TRANSACTIONS OF THE INSTITUTE OF MEASUREMENT AND CONTROL, 2020, 42 (10) :1808-1822
[23]   Successive Radial Basis Function Approximation for Finite-Horizon Nonlinear H∞ Control Problems [J].
Wang, Zhong ;
Liu, Yuxuan ;
Lang, Jinxi ;
Li, Yan .
IEEE CONTROL SYSTEMS LETTERS, 2025, 9 :1045-1050
[24]   Finite-Horizon Optimal Consensus Control for Unknown Multiagent State-Delay Systems [J].
Zhang, Huaipin ;
Park, Ju H. ;
Yue, Dong ;
Xie, Xiangpeng .
IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (02) :402-413
[25]   Evolution-guided Q-learning for tracking control of unknown dynamic systems [J].
Yuan, Zeqiang ;
Wang, Ding ;
Wang, Jiangyu ;
Zhao, Mingming ;
Qiao, Junfei .
NEUROCOMPUTING, 2025, 640
[26]   Reinforcement Q-Learning for PDF Tracking Control of Stochastic Systems with Unknown Dynamics [J].
Yang, Weiqing ;
Zhou, Yuyang ;
Zhang, Yong ;
Ren, Yan .
MATHEMATICS, 2024, 12 (16)
[27]   Adjustable iterative Q-learning for advanced neural tracking control with stability guarantee [J].
Wang, Yuan ;
Wang, Ding ;
Zhao, Mingming ;
Liu, Ao ;
Qiao, Junfei .
NEUROCOMPUTING, 2024, 584
[28]   Stable approximate Q-learning under discounted cost for data-based adaptive tracking control [J].
Liang, Zhantao ;
Ha, Mingming ;
Liu, Derong ;
Wang, Yonghua .
NEUROCOMPUTING, 2024, 568
[29]   Signal reconstruction in the presence of finite-rate measurements: finite-horizon control applications [J].
Sarma, Sridevi V. ;
Dahleh, Munther A. .
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2010, 20 (01) :41-58
[30]   Optimal finite-horizon production control in a defect-prone environment [J].
Kogan, K ;
Shu, C ;
Perkins, JR .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2004, 49 (10) :1795-1800