Q-learning based tracking control with novel finite-horizon performance index

Cited by: 0
Authors
Wang, Wei [1 ,2 ,3 ]
Wang, Ke [1 ]
Huang, Zixin [4 ]
Mu, Chaoxu [1 ]
Shi, Haoxian [5 ]
Affiliations
[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China
[2] Zhongnan Univ Econ & Law, Sch Informat Engn, Wuhan 430073, Peoples R China
[3] Zhongnan Univ Econ & Law, Emergency Management Res Ctr, Wuhan 430073, Peoples R China
[4] Wuhan Inst Technol, Sch Elect & Informat Engn, Wuhan 430205, Peoples R China
[5] China Geol Survey, Guangzhou Marine Geol Survey, Guangzhou 510075, Peoples R China
Keywords
Optimal tracking control; Model-free control; Q-function; Finite-horizon; NONLINEAR-SYSTEMS; TIME-SYSTEMS;
DOI
10.1016/j.ins.2024.121212
Chinese Library Classification (CLC)
TP [automation technology; computer technology];
Subject Classification Code
0812;
Abstract
In this paper, a data-driven method based on Q-learning is designed to achieve model-free finite-horizon optimal tracking control (FHOTC) of unknown linear discrete-time systems. First, a novel finite-horizon performance index (FHPI) that depends only on the next-step tracking error is introduced. Then, an augmented system is formulated that incorporates the system model and the trajectory model. Based on the novel FHPI, a derivation of the augmented time-varying Riccati equation (ATVRE) is provided. We present a data-driven FHOTC method that uses Q-learning to optimize the defined time-varying Q-function, which allows the solutions of the ATVRE to be estimated without knowledge of the system dynamics. Finally, the validity and features of the proposed Q-learning-based FHOTC method are demonstrated through comparative simulation studies.
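To make the abstract's ingredients concrete, one plausible reading (not the paper's exact formulation) of a finite-horizon performance index built on the next-step tracking error, with augmented state z_k = [x_k; r_k], is

J = \sum_{k=0}^{N-1} \left( e_{k+1}^{\top} Q\, e_{k+1} + u_k^{\top} R\, u_k \right), \qquad e_{k+1} = C x_{k+1} - r_{k+1},
\qquad Q_k(z_k, u_k) = e_{k+1}^{\top} Q\, e_{k+1} + u_k^{\top} R\, u_k + z_{k+1}^{\top} P_{k+1} z_{k+1}.

The sketch below is a minimal, illustrative Python implementation of this generic scheme, not the authors' algorithm: a backward pass over the horizon in which each time-varying Q-function kernel H_k is identified by least squares from (state, input, cost) samples, so no system matrices enter the identification step. All dimensions, matrices (A, B, F, C), weights, and sample counts are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical plant, reference generator, and weights -- illustrative
# assumptions only; they act purely as a data generator, and the
# least-squares identification below never reads A, B, or F directly.
np.random.seed(0)
n, m = 2, 1                                   # state / input dimensions
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
F = np.array([[0.99]])                        # reference model r_{k+1} = F r_k
C = np.array([[1.0, 0.0]])                    # tracked output y_k = C x_k
Qe, R = 10.0, 0.1                             # error / input weights
N = 20                                        # horizon length

nz = n + 1                                    # augmented state z = [x; r]
Az = np.block([[A, np.zeros((n, 1))], [np.zeros((1, n)), F]])
Bz = np.vstack([B, np.zeros((1, m))])
E = np.hstack([C, -np.eye(1)])                # tracking error e = E z
d = nz + m                                    # size of w = [z; u]

def quad_basis(w):
    """Quadratic features of w whose weights form a symmetric kernel H."""
    outer = np.outer(w, w)
    iu = np.triu_indices(len(w))
    phi = outer[iu].copy()
    phi[iu[0] != iu[1]] *= 2.0                # double off-diagonal monomials
    return phi

def theta_to_kernel(theta):
    """Rebuild the symmetric Q-function kernel H_k from the fitted weights."""
    H = np.zeros((d, d))
    H[np.triu_indices(d)] = theta
    return H + np.triu(H, 1).T

P_next = np.zeros((nz, nz))                   # terminal value kernel P_N = 0
gains = [None] * N

for k in reversed(range(N)):                  # backward pass over the horizon
    Phi, y = [], []
    for _ in range(10 * d * (d + 1) // 2):    # exploratory (z, u, cost) samples
        z, u = np.random.randn(nz), np.random.randn(m)
        z_next = Az @ z + Bz @ u
        e_next = E @ z_next                   # stage cost uses NEXT-step error
        target = float(Qe * e_next @ e_next + R * u @ u
                       + z_next @ P_next @ z_next)
        Phi.append(quad_basis(np.concatenate([z, u])))
        y.append(target)
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = theta_to_kernel(theta)
    Hzz, Hzu, Huu = H[:nz, :nz], H[:nz, nz:], H[nz:, nz:]
    K = np.linalg.solve(Huu, Hzu.T)           # time-varying gain: u_k = -K_k z_k
    gains[k] = K
    P_next = Hzz - Hzu @ K                    # value kernel propagated backward

# Closed-loop check: track the reference from an arbitrary initial state.
x, r = np.array([1.0, 0.0]), np.array([0.5])
for k in range(N):
    u = -gains[k] @ np.concatenate([x, r])
    x, r = A @ x + B @ u, F @ r
print("final tracking error:", (C @ x - r).item())
```

Because the cost samples are exact quadratics in [z; u], the least-squares fit recovers each H_k exactly here; with measured data the same structure would be estimated from noisy rollouts instead.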
Pages: 10