Model-Free Optimal Tracking Control of Nonlinear Input-Affine Discrete-Time Systems via an Iterative Deterministic Q-Learning Algorithm

Cited by: 46

Authors:

Song, Shijie [1]

Zhu, Minglei [1]

Dai, Xiaolin [1]

Gong, Dawei [1]

Affiliations:

[1] Univ Elect Sci & Technol China, Sch Mech & Elect Engn, Chengdu 611731, Peoples R China

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2024, Vol. 35, No. 01

Funding:

Academy of Finland;

Keywords:

Heuristic algorithms; Q-learning; Nonlinear dynamical systems; Approximation algorithms; Iterative algorithms; Convergence; Artificial neural networks; Adaptive dynamic programming (ADP); neural network (NN); off-policy technique; optimal tracking control (OTC); CONTROL SCHEME; LINEAR-SYSTEMS;

DOI:

10.1109/TNNLS.2022.3178746

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this article, a novel model-free dynamic inversion-based Q-learning (DIQL) algorithm is proposed to solve the optimal tracking control (OTC) problem of unknown nonlinear input-affine discrete-time (DT) systems. Compared with the existing DIQL algorithm and the discount factor-based Q-learning (DFQL) algorithm, the proposed algorithm can eliminate the tracking error while ensuring that it is model-free and off-policy. First, a new deterministic Q-learning iterative scheme is presented, and based on this scheme, a model-based off-policy DIQL algorithm is designed. The advantage of this new scheme is that it can avoid the training of unusual data and improve data utilization, thereby saving computing resources. Simultaneously, the convergence and stability of the designed algorithm are analyzed, and the proof that adding probing noise into the behavior policy does not affect the convergence is presented. Then, by introducing neural networks (NNs), the model-free version of the designed algorithm is further proposed so that the OTC problem can be solved without any knowledge about the system dynamics. Finally, three simulation examples are given to demonstrate the effectiveness of the proposed algorithm.


Pages: 999-1012

Page count: 14

