Dynamic compensator-based near-optimal control for unknown nonaffine systems via integral reinforcement learning

Cited by: 8
Authors
Lin, Jinquan [1 ]
Zhao, Bo [2 ]
Liu, Derong [3 ,4 ]
Wang, Yonghua [1 ]
Affiliations
[1] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China
[2] Beijing Normal Univ, Sch Syst Sci, Beijing 100875, Peoples R China
[3] Southern Univ Sci & Technol, Sch Syst Design & Intelligent Mfg, Shenzhen 518055, Peoples R China
[4] Univ Illinois, Dept Elect & Comp Engn, Chicago, IL 60607 USA
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Neuro-dynamic programming; Adaptive dynamic programming; Reinforcement learning; Optimal control; Neural networks; Dynamic compensator; CONTINUOUS-TIME; EXPERIENCE REPLAY; DESIGN; ALGORITHM;
DOI
10.1016/j.neucom.2023.126973
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper, a dynamic compensator-based near-optimal control approach for unknown nonaffine nonlinear systems is developed by using integral reinforcement learning. Since the system dynamics are unknown, it is difficult to obtain the optimal control policy via neuro-dynamic programming. To address this problem, a general dynamic compensator is introduced as the virtual control input, which augments the unknown nonaffine nonlinear system into a partially unknown affine system. For the augmented system, a novel quadratic value function is designed in terms of the system states, the actual control input and the virtual control input. The optimal control of the augmented system can be regarded as a near-optimal control for the original system, since the augmented optimal value function is an upper bound of the original optimal value function. To avoid identifying the system dynamics, the integral reinforcement learning framework is utilized to derive the optimal control from the solution of the Hamilton-Jacobi-Bellman equation via a critic-only structure. Meanwhile, the weight learning rule of the critic neural network is presented with the experience replay technique to relax the persistence of excitation condition. Moreover, the uniform ultimate boundedness of the weight estimation errors and the stability of the closed-loop system are guaranteed by using Lyapunov's direct method. Finally, simulation results of two examples demonstrate the effectiveness of the developed dynamic compensator-based near-optimal control method.
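To make the augmentation step concrete, here is a minimal sketch of the construction described above; the integrator-type compensator and the symbols f, F, G, Q, R, S are illustrative assumptions rather than the paper's exact formulation, which allows a more general dynamic compensator.

    Unknown nonaffine system:            \dot{x} = f(x, u)
    Dynamic compensator (virtual input v, assumed integrator):  \dot{u} = v
    Augmented state:                     X = [x^\top, u^\top]^\top
    Partially unknown affine form:       \dot{X} = F(X) + G v,  with  F(X) = [f(x,u)^\top, 0^\top]^\top,  G = [0, I]^\top
    Augmented quadratic value function:  V(X(t)) = \int_t^\infty ( x^\top Q x + u^\top R u + v^\top S v ) d\tau

Minimizing this augmented value function yields the near-optimal control for the original nonaffine system, in line with the upper-bound argument stated in the abstract.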
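The abstract also outlines a critic-only integral reinforcement learning scheme whose weight update uses experience replay. The snippet below is a generic Python sketch of such an update, not the paper's exact learning rule; the feature vector phi, the normalized gradient step, and the replay-buffer layout are assumptions made purely for illustration.

    import numpy as np

    def phi(X):
        # Hypothetical quadratic feature vector for the critic network
        # (the paper's basis functions are not given in the abstract).
        x1, x2, u = X
        return np.array([x1**2, x2**2, u**2, x1*x2, x1*u, x2*u])

    def bellman_residual(W, X_prev, X_now, integral_cost):
        # Integral RL Bellman error over one reinforcement interval [t-T, t]:
        # e = W^T phi(X(t)) + int_{t-T}^{t} r(x, u, v) dtau - W^T phi(X(t-T))
        return W @ phi(X_now) + integral_cost - W @ phi(X_prev)

    def update_critic(W, replay_buffer, lr=0.5):
        # One gradient-descent step on the squared Bellman error, summed over
        # the current and stored samples; replaying past data is what relaxes
        # the persistence of excitation condition.
        grad = np.zeros_like(W)
        for X_prev, X_now, integral_cost in replay_buffer:
            e = bellman_residual(W, X_prev, X_now, integral_cost)
            dphi = phi(X_now) - phi(X_prev)
            grad += e * dphi / (1.0 + dphi @ dphi) ** 2  # normalized gradient
        return W - lr * grad

Here W is the critic weight vector, and each replay-buffer entry stores the augmented state at the two ends of one reinforcement interval together with the measured integral cost over that interval.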
Pages: 9