H∞ tracking control for perturbed discrete-time systems using On/Off policy Q-learning algorithms

Cited by: 1
Authors
Dao, Phuong Nam [1 ]
Dao, Quang Huy [1 ]
Affiliations
[1] Hanoi Univ Sci & Technol, Sch Elect & Elect Engn, Hanoi, Vietnam
Keywords
Perturbed discrete-time systems; Q-learning; On/off policy algorithm; Model-free control; Reinforcement learning control; ADAPTIVE OPTIMAL-CONTROL;
D O I
10.1016/j.chaos.2025.116459
CLC classification
O1 [Mathematics];
Discipline codes
0701 ; 070101 ;
Abstract
The widely studied H∞ zero-sum game problem incorporates external disturbance directly into the optimal control formulation. In this article, two model-free Q-learning algorithms based on H∞ tracking control are proposed for perturbed discrete-time systems subject to external disturbance. A modification of the output optimal control problem is also presented. For the optimal tracking control problem, a discount factor is required to keep the cost function finite, and the Riccati equation is modified accordingly. Using the deviation between Q functions at two consecutive time steps, the underlying principle of on/off-policy learning, and the H∞ zero-sum game formulation, two on/off-policy Q-learning algorithms for H∞ tracking control are proposed. Then, by computing the Q function, the influence of probing noise on the Q function is analyzed. An analysis of solution equivalence shows that convergence and tracking are guaranteed under the proposed algorithms. Finally, simulation studies on an F-16 aircraft model assess the validity of the presented control schemes.
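The Q-function recursion underlying the abstract can be illustrated with a small model-based sketch of value iteration for a discrete-time zero-sum LQ game. This is only an illustration: the paper's algorithms are model-free and estimate the quadratic Q-function kernel H from measured data, whereas the sketch below assumes known (and purely hypothetical) system matrices A, B, D, weights, attenuation level, and discount factor.

```python
import numpy as np

# Illustrative, model-based value iteration for the discounted zero-sum LQ game
#   x_{k+1} = A x_k + B u_k + D w_k,
#   cost sum_k gamma^k (x'Qx x + u'R u - beta2 * w'w),
# whose fixed point the paper's model-free Q-learning schemes approximate.
# All numbers here are hypothetical examples, not taken from the paper.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.1], [0.0]])
Qx = np.eye(2)          # state weight
R = np.eye(1)           # control weight
beta2 = 5.0             # squared H-infinity attenuation level
gamma = 0.95            # discount factor (needed for the tracking cost)

n, m, q = 2, 1, 1
P = np.zeros((n, n))    # value kernel, V(x) = x' P x

for _ in range(500):
    # Q-function kernel H with Q(x, u, w) = [x; u; w]' H [x; u; w]
    H = np.block([
        [Qx + gamma * A.T @ P @ A, gamma * A.T @ P @ B, gamma * A.T @ P @ D],
        [gamma * B.T @ P @ A, R + gamma * B.T @ P @ B, gamma * B.T @ P @ D],
        [gamma * D.T @ P @ A, gamma * D.T @ P @ B,
         -beta2 * np.eye(q) + gamma * D.T @ P @ D],
    ])
    Hxx = H[:n, :n]
    Hxu, Hxw = H[:n, n:n + m], H[:n, n + m:]
    Huu, Huw = H[n:n + m, n:n + m], H[n:n + m, n + m:]
    Hwu, Hww = H[n + m:, n:n + m], H[n + m:, n + m:]
    # minimax over (u, w) gives the updated value kernel
    Huw_blk = np.block([[Huu, Huw], [Hwu, Hww]])
    Hx_blk = np.hstack([Hxu, Hxw])
    P_next = Hxx - Hx_blk @ np.linalg.solve(Huw_blk, Hx_blk.T)
    if np.linalg.norm(P_next - P) < 1e-10:
        P = P_next
        break
    P = P_next

# saddle-point feedback gains: u = -Ku x, w = -Kw x
gains = np.linalg.solve(Huw_blk, Hx_blk.T)
Ku, Kw = gains[:m], gains[m:]
```

In the model-free setting described in the abstract, the kernel H is instead identified from state, input, and disturbance data via least squares on the temporal difference between Q functions at consecutive time steps, which removes the need for A, B, and D.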
Pages: 20
Related papers
46 references in total
[11]   Off-policy two-dimensional reinforcement learning for optimal tracking control of batch processes with network-induced dropout and disturbances [J].
Jiang, Xueying ;
Huang, Min ;
Shi, Huiyuan ;
Wang, Xingwei ;
Zhang, Yanfeng .
ISA TRANSACTIONS, 2024, 144 :228-244
[12]   Reinforcement learning and cooperative H∞ output regulation of linear continuous-time multi-agent systems [J].
Jiang, Yi ;
Gao, Weinan ;
Wu, Jin ;
Chai, Tianyou ;
Lewis, Frank L. .
AUTOMATICA, 2023, 148
[13]   H∞ control of linear discrete-time systems: Off-policy reinforcement learning [J].
Kiumarsi, Bahare ;
Lewis, Frank L. ;
Jiang, Zhong-Ping .
AUTOMATICA, 2017, 78 :144-152
[14]   Optimal dynamic Control Allocation with guaranteed constraints and online Reinforcement Learning [J].
Kolaric, Patrik ;
Lopez, Victor G. ;
Lewis, Frank L. .
AUTOMATICA, 2020, 122
[15]   Compensator-Based Self-Learning: Optimal Operational Control for Two-Time-Scale Systems With Input Constraints [J].
Li, Jinna ;
Yang, Mingwei ;
Lewis, Frank L. ;
Zheng, Meng .
IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (07) :9465-9475
[16]   NN-Based Reinforcement Learning Optimal Control for Inequality-Constrained Nonlinear Discrete-Time Systems With Disturbances [J].
Li, Shu ;
Ding, Liang ;
Zheng, Miao ;
Liu, Zixuan ;
Li, Xinyu ;
Yang, Huaiguang ;
Gao, Haibo ;
Deng, Zongquan .
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (11) :15507-15516
[17]   Static-Output-Feedback Based Robust Fuzzy Wheelbase Preview Control for Uncertain Active Suspensions With Time Delay and Finite Frequency Constraint [J].
Li, Wenfeng ;
Xie, Zhengchao ;
Zhao, Jing ;
Wong, Pak Kin ;
Wang, Hui ;
Wang, Xiaowei .
IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2021, 8 (03) :664-678
[18]   Policy Optimization Adaptive Dynamic Programming for Optimal Control of Input-Affine Discrete-Time Nonlinear Systems [J].
Lin, Mingduo ;
Zhao, Bo .
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2023, 53 (07) :4339-4350
[19]   Finite-Horizon Optimal Control for Nonlinear Multi-Input Systems With Online Adaptive Integral Reinforcement Learning [J].
Lv, Yongfeng ;
Zhang, Wan ;
Zhao, Jun ;
Zhao, Xiaowei .
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2025, 22 :802-812
[20]   Robust hierarchical games of linear discrete-time systems based on off-policy model-free reinforcement learning [J].
Ma, Xiao ;
Yuan, Yuan .
JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS, 2024, 361 (07)