Model-Free Optimal Tracking Control of Nonlinear Input-Affine Discrete-Time Systems via an Iterative Deterministic Q-Learning Algorithm

Cited by: 46
Authors
Song, Shijie [1]
Zhu, Minglei [1]
Dai, Xiaolin [1]
Gong, Dawei [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Mech & Elect Engn, Chengdu 611731, Peoples R China
Funding
Academy of Finland;
Keywords
Heuristic algorithms; Q-learning; Nonlinear dynamical systems; Approximation algorithms; Iterative algorithms; Convergence; Artificial neural networks; Adaptive dynamic programming (ADP); neural network (NN); off-policy technique; optimal tracking control (OTC); CONTROL SCHEME; LINEAR-SYSTEMS;
DOI
10.1109/TNNLS.2022.3178746
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this article, a novel model-free dynamic inversion-based Q-learning (DIQL) algorithm is proposed to solve the optimal tracking control (OTC) problem of unknown nonlinear input-affine discrete-time (DT) systems. Compared with the existing DIQL algorithm and the discount factor-based Q-learning (DFQL) algorithm, the proposed algorithm can eliminate the tracking error while ensuring that it is model-free and off-policy. First, a new deterministic Q-learning iterative scheme is presented, and based on this scheme, a model-based off-policy DIQL algorithm is designed. The advantage of this new scheme is that it can avoid the training of unusual data and improve data utilization, thereby saving computing resources. Simultaneously, the convergence and stability of the designed algorithm are analyzed, and the proof that adding probing noise into the behavior policy does not affect the convergence is presented. Then, by introducing neural networks (NNs), the model-free version of the designed algorithm is further proposed so that the OTC problem can be solved without any knowledge about the system dynamics. Finally, three simulation examples are given to demonstrate the effectiveness of the proposed algorithm.
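The idea of combining dynamic inversion with a Q-function iteration on the tracking error can be illustrated on a toy problem. The sketch below is not the algorithm from the article: it assumes a scalar linear system x_{k+1} = a x_k + b u_k with known a and b, a quadratic cost with hypothetical weights q and rho, and an undiscounted value iteration on a quadratic Q-function. All function and parameter names are made up for illustration.

```python
import math


def q_value_iteration(a, b, q, rho, iters=200):
    """Value iteration on Q_i(e, v) = h_ee*e^2 + 2*h_ev*e*v + h_vv*v^2.

    Assumed error dynamics: e_{k+1} = a*e_k + b*v_k, stage cost q*e^2 + rho*v^2.
    """
    p = 0.0  # kernel of V_0(e) = 0
    for _ in range(iters):
        # Q_{i+1}(e, v) = q*e^2 + rho*v^2 + V_i(a*e + b*v)
        h_ee = q + a * a * p
        h_ev = a * b * p
        h_vv = rho + b * b * p
        # V_{i+1}(e) = min over v of Q_{i+1}(e, v)
        p = h_ee - h_ev * h_ev / h_vv
    gain = h_ev / h_vv  # greedy policy: v = -gain * e
    return p, gain


def simulate(a, b, gain, reference, x0):
    """Apply u_k = u_d,k + v_k and return the tracking errors x_k - r_k."""
    x, errors = x0, []
    for k in range(len(reference) - 1):
        # Dynamic inversion: the input that keeps x on the reference.
        u_d = (reference[k + 1] - a * reference[k]) / b
        v = -gain * (x - reference[k])  # feedback on the tracking error
        x = a * x + b * (u_d + v)
        errors.append(x - reference[k + 1])
    return errors


if __name__ == "__main__":
    p, gain = q_value_iteration(a=1.2, b=1.0, q=1.0, rho=1.0)
    ref = [math.sin(0.1 * k) for k in range(60)]
    errs = simulate(1.2, 1.0, gain, ref, x0=2.0)
    print(p, gain, abs(errs[-1]))
```

Because the cost penalizes v = u - u_d rather than u itself, no discount factor is needed to keep the cost finite for a non-vanishing reference, and the error in this toy setting decays to zero instead of settling at an offset.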
Pages: 999-1012
Number of pages: 14
Related papers
50 in total
[41]   Data-Based Optimal Tracking Control of Nonaffine Nonlinear Discrete-Time Systems [J].
Luo, Biao ;
Liu, Derong ;
Huang, Tingwen ;
Li, Chao .
NEURAL INFORMATION PROCESSING, ICONIP 2016, PT IV, 2016, 9950 :573-581
[42]   Q-learning for continuous-time linear systems: A model-free infinite horizon optimal control approach [J].
Vamvoudakis, Kyriakos G. .
SYSTEMS & CONTROL LETTERS, 2017, 100 :14-20
[43]   Optimal trajectory tracking for uncertain linear discrete-time systems using time-varying Q-learning [J].
Geiger, Maxwell ;
Narayanan, Vignesh ;
Jagannathan, Sarangapani .
INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2024, 38 (07) :2340-2368
[44]   Model-Free Adaptive Control for Unknown MIMO Nonaffine Nonlinear Discrete-Time Systems With Experimental Validation [J].
Xiong, Shuangshuang ;
Hou, Zhongsheng .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (04) :1727-1739
[45]   Error-Based Model-Free Adaptive Performance Tuning Control With Disturbance Rejection for Discrete-Time Nonlinear Systems [J].
Cheng, Yun ;
Chen, Qiang ;
Hu, Shuangyi ;
Ren, Xuemei ;
Yang, Mingyu ;
He, Xiongxiong .
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2025,
[46]   Reinforcement Q-learning and Optimal Tracking Control of Unknown Discrete-time Multi-player Systems Based on Game Theory [J].
Zhao, Jin-Gang .
INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2024, 22 (05) :1751-1759
[47]   The Adaptive Optimal Output Feedback Tracking Control of Unknown Discrete-Time Linear Systems Using a Multistep Q-Learning Approach [J].
Dong, Xunde ;
Lin, Yuxin ;
Suo, Xudong ;
Wang, Xihao ;
Sun, Weijie .
MATHEMATICS, 2024, 12 (04)
[48]   H∞ tracking control for perturbed discrete-time systems using On/Off policy Q-learning algorithms [J].
Dao, Phuong Nam ;
Dao, Quang Huy .
CHAOS SOLITONS & FRACTALS, 2025, 197
[49]   Dual-staged optimal iterative learning control for a class of nonlinear discrete-time systems [J].
Chi Rong-hu ;
Hou Zhong-sheng .
Proceedings of 2005 Chinese Control and Decision Conference, Vols 1 and 2, 2005, :852-856
[50]   Optimal control for unknown mean-field discrete-time system based on Q-Learning [J].
Ge, Yingying ;
Liu, Xikui ;
Li, Yan .
INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2021, 52 (15) :3335-3349