Neural network-based event-triggered integral reinforcement learning for constrained H∞ tracking control with experience replay

Cited by: 14
Authors
Xue, Shan [1 ,2 ]
Luo, Biao [3 ]
Liu, Derong [4 ]
Gao, Ying [1 ]
Affiliations
[1] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518000, Peoples R China
[3] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China
[4] Univ Illinois, Dept Elect & Comp Engn, Chicago, IL 60607 USA
Keywords
Adaptive dynamic programming; Neural networks; Integral reinforcement learning; H∞ tracking control; Event-triggered mechanism; UNCERTAIN NONLINEAR-SYSTEMS; FIXED-TIME CONSENSUS; POLICY ITERATION; FEEDBACK-CONTROL; ALGORITHM; DESIGN
DOI
10.1016/j.neucom.2022.09.119
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Input constraints and external disturbances are unavoidable in tracking control problems, and obtaining a controller under these conditions that simultaneously saves communication and data resources is very challenging. To address these challenges, this paper develops a novel neural network (NN)-based event-triggered integral reinforcement learning (IRL) algorithm for constrained H∞ tracking control problems. First, the constrained H∞ tracking control problem is transformed into a regulation problem. Second, an event-triggered optimal controller is designed to reduce the network transmission burden and improve resource utilization, where a novel triggering threshold is proposed and its non-negativity is guaranteed. Third, for implementation purposes, a novel NN-based event-triggered IRL algorithm is developed. To improve data utilization, the experience replay technique with an easy-to-verify condition is employed in the learning process. Theoretical analysis proves that the tracking error and the weight estimation error are uniformly ultimately bounded. Finally, simulation results demonstrate the effectiveness of the proposed method. (c) 2022 Elsevier B.V. All rights reserved.
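As a rough illustration of the mechanism summarized in the abstract, the following Python sketch combines an event-triggered sampling rule with an experience-replay critic update in the IRL style. It is not the paper's algorithm: the class name EventTriggeredIRLSketch, the polynomial critic features phi, the fixed triggering threshold, and the quadratic stand-in for the integral stage cost are all illustrative assumptions, not the authors' derived threshold, activation functions, or cost terms.

```python
import numpy as np


class EventTriggeredIRLSketch:
    """Toy event-triggered IRL critic with an experience-replay buffer.

    The critic approximates the value function as V(e) ~= W^T phi(e), where e is
    the tracking error. phi, the learning rate, and the buffer size are
    illustrative choices, not the paper's design.
    """

    def __init__(self, n_features, buffer_size=50, lr=0.05):
        self.W = np.zeros(n_features)      # critic NN weight estimate
        self.buffer = []                   # replay buffer of IRL samples
        self.buffer_size = buffer_size
        self.lr = lr

    def phi(self, e):
        # Placeholder critic activations: linear plus quadratic terms of e.
        quad = np.outer(e, e)[np.triu_indices(len(e))]
        return np.concatenate([e, quad])

    def triggered(self, e, e_hat, threshold):
        # Event-triggering rule: update only when the gap between the current
        # error e and the last sampled error e_hat exceeds the threshold.
        # Clamping at zero mimics the requirement that the threshold stay non-negative.
        return np.linalg.norm(e - e_hat) ** 2 >= max(threshold, 0.0)

    def store(self, delta_phi, integral_reward):
        # Keep a finite history of (feature difference, accumulated cost) pairs.
        self.buffer.append((delta_phi, integral_reward))
        if len(self.buffer) > self.buffer_size:
            self.buffer.pop(0)

    def replay_update(self):
        # Replay all stored IRL samples: drive the Bellman-like residual
        # W^T delta_phi + r toward zero with one gradient step per sample.
        for delta_phi, r in self.buffer:
            residual = self.W @ delta_phi + r
            self.W -= self.lr * residual * delta_phi


if __name__ == "__main__":
    np.random.seed(0)
    agent = EventTriggeredIRLSketch(n_features=5)   # e in R^2 -> 2 linear + 3 quadratic features
    e_hat = np.ones(2)                               # tracking error at the last triggering instant
    for k in range(200):
        # Stand-in for a decaying tracking-error trajectory (no simulated plant).
        e = 0.97 ** k * np.ones(2) + 0.01 * np.random.randn(2)
        if agent.triggered(e, e_hat, threshold=0.05):
            delta_phi = agent.phi(e) - agent.phi(e_hat)   # feature change over the inter-event interval
            integral_reward = float(e @ e)                # placeholder for the integral stage cost
            agent.store(delta_phi, integral_reward)
            agent.replay_update()
            e_hat = e.copy()                              # refresh the held (sampled) state
    print("learned critic weights:", agent.W)
```

The fixed threshold and plain per-sample gradient replay above are simplifications: in the paper, the triggering threshold and the easy-to-verify replay condition are part of the derived design and stability analysis.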
Pages: 25-35
Number of pages: 11