Adaptive Optimal Control of Unknown Constrained-Input Systems Using Policy Iteration and Neural Networks

Cited by: 361

Authors:

Modares, Hamidreza [1]

Lewis, Frank L. [2]

Naghibi-Sistani, Mohammad-Bagher [1]

Affiliations:

[1] Ferdowsi Univ Mashhad, Dept Elect Engn, Mashhad, Iran

[2] Univ Texas Arlington, Res Inst, Ft Worth, TX 76118 USA

Source:

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS | 2013, Vol. 24, No. 10

Funding:

National Science Foundation (USA)

Keywords:

Input constraints; neural networks; optimal control; reinforcement learning; unknown dynamics; CONTINUOUS-TIME;

DOI:

10.1109/TNNLS.2013.2276571

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents an online policy iteration (PI) algorithm to learn the continuous-time optimal control solution for unknown constrained-input systems. The proposed PI algorithm is implemented on an actor-critic structure where two neural networks (NNs) are tuned online and simultaneously to generate the optimal bounded control policy. The requirement of complete knowledge of the system dynamics is obviated by employing a novel NN identifier in conjunction with the actor and critic NNs. It is shown how the identifier weights estimation error affects the convergence of the critic NN. A novel learning rule is developed to guarantee that the identifier weights converge to small neighborhoods of their ideal values exponentially fast. To provide an easy-to-check persistence of excitation condition, the experience replay technique is used. That is, recorded past experiences are used simultaneously with current data for the adaptation of the identifier weights. Stability of the whole system consisting of the actor, critic, system state, and system identifier is guaranteed while all three networks undergo adaptation. Convergence to a near-optimal control law is also shown. The effectiveness of the proposed method is illustrated with a simulation example.
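The experience replay idea mentioned in the abstract (reusing recorded data alongside current data so that excitation can be checked on the stored samples) can be illustrated with a minimal sketch. This is not the paper's learning rule: it assumes a discrete-time, linear-in-parameters model y = Wᵀφ with a plain gradient update, and the names `replay_update`, `memory_is_rich`, `W_true`, and the sample values are all hypothetical choices made for illustration.

```python
import numpy as np

def replay_update(W, phi, y, memory, lr):
    """One gradient step on the current sample plus every recorded sample."""
    grad = phi * (y - W @ phi)              # current-data term
    for phi_j, y_j in memory:               # recorded-data (replay) terms
        grad = grad + phi_j * (y_j - W @ phi_j)
    return W + lr * grad

def memory_is_rich(memory, n_params):
    """Rank test on stored regressors: an easy-to-check stand-in for excitation."""
    Phi = np.stack([p for p, _ in memory], axis=1)
    return np.linalg.matrix_rank(Phi) == n_params

# Hypothetical ideal weights and a small recorded data set (assumed values).
W_true = np.array([1.5, -0.7, 0.3])
regressors = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0],
                       [1.0, 1.0, 0.0]])
memory = [(p, W_true @ p) for p in regressors]

# The current regressor excites only one direction, so current data alone
# cannot identify all three weights.
phi_now = np.array([1.0, 0.0, 0.0])
y_now = W_true @ phi_now

W_hat = np.zeros(3)        # estimate using current data plus replayed data
W_no_replay = np.zeros(3)  # estimate using current data only
for _ in range(500):
    W_hat = replay_update(W_hat, phi_now, y_now, memory, lr=0.1)
    W_no_replay = replay_update(W_no_replay, phi_now, y_now, [], lr=0.1)
```

In this toy setting the estimate that replays the stored samples recovers all three weights, while the estimate driven only by the current, poorly exciting regressor recovers just the first one. The rank test plays the role of the easy-to-check condition: it is evaluated once on the recorded data rather than along the future trajectory.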


Pages: 1513-1525

Page count: 13

50 records in total

[41] Adaptive Optimal Control of UAV Formation Based on Policy Iteration
Xu, Guangyan
Zhang, Shugang
Lin, Hao
2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022, : 4145 - 4150
[42] Data-Based Optimal Control for Weakly Coupled Nonlinear Systems Using Policy Iteration
Li, Chao
Liu, Derong
Wang, Ding
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2018, 48 (04): : 511 - 521
[43] Event-triggered adaptive dynamic programming for decentralized tracking control of input constrained unknown nonlinear interconnected systems
Wu, Qiuye
Zhao, Bo
Liu, Derong
Polycarpou, Marios M.
NEURAL NETWORKS, 2023, 157 : 336 - 349
[44] Robust optimal control for a class of nonlinear systems with unknown disturbances based on disturbance observer and policy iteration
Song, Ruizhuo
Lewis, Frank L.
NEUROCOMPUTING, 2020, 390 : 185 - 195
[45] Data-Based H∞ Control for the Constrained-Input Nonlinear Systems and its Applications in Chaotic Circuit Systems
Ren, Ling
Zhang, Guoshan
Mu, Chaoxu
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2020, 67 (08) : 2791 - 2802
[46] An optimal control model of neural networks for constrained optimization problems
Song, Q
Leland, RP
OPTIMAL CONTROL APPLICATIONS & METHODS, 1998, 19 (05) : 371 - 376
[47] Adaptive control of linear systems with unknown memoryless input nonlinearities using ANN
Knohl, T
Unbehauen, H
ADAPTIVE SYSTEMS IN CONTROL AND SIGNAL PROCESSING 1998, 2000, : 439 - 444
[48] Adaptive optimal control approach to robust tracking of uncertain linear systems based on policy iteration
Xu, Dengguo
Wang, Qinglin
Li, Yuan
MEASUREMENT & CONTROL, 2021, 54 (5-6) : 668 - 680
[49] Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems
Wei, Qinglai
Liu, Derong
Lin, Hanquan
IEEE TRANSACTIONS ON CYBERNETICS, 2016, 46 (03) : 840 - 853
[50] Adaptive Optimal Control of Linear Periodic Systems: An Off-Policy Value Iteration Approach
Pang, Bo
Jiang, Zhong-Ping
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (02) : 888 - 894
