Q-learning based tracking control with novel finite-horizon performance index

Cited by: 0
Authors
Wang, Wei [1 ,2 ,3 ]
Wang, Ke [1 ]
Huang, Zixin [4 ]
Mu, Chaoxu [1 ]
Shi, Haoxian [5 ]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Zhongnan Univ Econ & Law, Sch Informat Engn, Wuhan 430073, Peoples R China
[3] Zhongnan Univ Econ & Law, Emergency Management Res Ctr, Wuhan 430073, Peoples R China
[4] Wuhan Inst Technol, Sch Elect & Informat Engn, Wuhan 430205, Peoples R China
[5] China Geol Survey, Guangzhou Marine Geol Survey, Guangzhou 510075, Peoples R China
Keywords
Optimal tracking control; Model-free control; Q-function; Finite-horizon; NONLINEAR-SYSTEMS; TIME-SYSTEMS;
DOI
10.1016/j.ins.2024.121212
Chinese Library Classification (CLC)
TP [automation technology; computer technology];
Subject Classification Code
0812;
Abstract
In this paper, a data-driven method based on Q-learning is designed to achieve model-free finite-horizon optimal tracking control (FHOTC) of unknown linear discrete-time systems. First, a novel finite-horizon performance index (FHPI) that depends only on the next-step tracking error is introduced. Then, an augmented system is formulated that incorporates the system model and the trajectory model. Based on the novel FHPI, a derivation of the augmented time-varying Riccati equation (ATVRE) is provided. We present a data-driven FHOTC method that uses Q-learning to optimize the defined time-varying Q-function, which allows the solutions of the ATVRE to be estimated without knowledge of the system dynamics. Finally, the validity and features of the proposed Q-learning-based FHOTC method are demonstrated through comparative simulation studies.
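To make the abstract's ingredients concrete, one plausible reading (not the paper's exact formulation) of a finite-horizon performance index built on the next-step tracking error, with augmented state z_k = [x_k; r_k], is

J = \sum_{k=0}^{N-1} \left( e_{k+1}^{\top} Q\, e_{k+1} + u_k^{\top} R\, u_k \right), \qquad e_{k+1} = C x_{k+1} - r_{k+1},
\qquad Q_k(z_k, u_k) = e_{k+1}^{\top} Q\, e_{k+1} + u_k^{\top} R\, u_k + z_{k+1}^{\top} P_{k+1} z_{k+1}.

The sketch below is a minimal, illustrative Python implementation of this generic scheme, not the authors' algorithm: a backward pass over the horizon in which each time-varying Q-function kernel H_k is identified by least squares from (state, input, cost) samples, so no system matrices enter the identification step. All dimensions, matrices (A, B, F, C), weights, and sample counts are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical plant, reference generator, and weights -- illustrative
# assumptions only; they act purely as a data generator, and the
# least-squares identification below never reads A, B, or F directly.
np.random.seed(0)
n, m = 2, 1                                   # state / input dimensions
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
F = np.array([[0.99]])                        # reference model r_{k+1} = F r_k
C = np.array([[1.0, 0.0]])                    # tracked output y_k = C x_k
Qe, R = 10.0, 0.1                             # error / input weights
N = 20                                        # horizon length

nz = n + 1                                    # augmented state z = [x; r]
Az = np.block([[A, np.zeros((n, 1))], [np.zeros((1, n)), F]])
Bz = np.vstack([B, np.zeros((1, m))])
E = np.hstack([C, -np.eye(1)])                # tracking error e = E z
d = nz + m                                    # size of w = [z; u]

def quad_basis(w):
    """Quadratic features of w whose weights form a symmetric kernel H."""
    outer = np.outer(w, w)
    iu = np.triu_indices(len(w))
    phi = outer[iu].copy()
    phi[iu[0] != iu[1]] *= 2.0                # double off-diagonal monomials
    return phi

def theta_to_kernel(theta):
    """Rebuild the symmetric Q-function kernel H_k from the fitted weights."""
    H = np.zeros((d, d))
    H[np.triu_indices(d)] = theta
    return H + np.triu(H, 1).T

P_next = np.zeros((nz, nz))                   # terminal value kernel P_N = 0
gains = [None] * N

for k in reversed(range(N)):                  # backward pass over the horizon
    Phi, y = [], []
    for _ in range(10 * d * (d + 1) // 2):    # exploratory (z, u, cost) samples
        z, u = np.random.randn(nz), np.random.randn(m)
        z_next = Az @ z + Bz @ u
        e_next = E @ z_next                   # stage cost uses NEXT-step error
        target = float(Qe * e_next @ e_next + R * u @ u
                       + z_next @ P_next @ z_next)
        Phi.append(quad_basis(np.concatenate([z, u])))
        y.append(target)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = theta_to_kernel(theta)
    Hzz, Hzu, Huu = H[:nz, :nz], H[:nz, nz:], H[nz:, nz:]
    K = np.linalg.solve(Huu, Hzu.T)           # time-varying gain: u_k = -K_k z_k
    gains[k] = K
    P_next = Hzz - Hzu @ K                    # value kernel propagated backward

# Closed-loop check: track the reference from an arbitrary initial state.
x, r = np.array([1.0, 0.0]), np.array([0.5])
for k in range(N):
    u = -gains[k] @ np.concatenate([x, r])
    x, r = A @ x + B @ u, F @ r
print("final tracking error:", (C @ x - r).item())
```

Because the cost samples are exact quadratics in [z; u], the least-squares fit recovers each H_k exactly here; with measured data the same structure would be estimated from noisy rollouts instead.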
Pages: 10