Adaptive Optimal Control of Unknown Constrained-Input Systems Using Policy Iteration and Neural Networks

Cited by: 361
Authors
Modares, Hamidreza [1]
Lewis, Frank L. [2]
Naghibi-Sistani, Mohammad-Bagher [1]
Affiliations
[1] Ferdowsi Univ Mashhad, Dept Elect Engn, Mashhad, Iran
[2] Univ Texas Arlington, Res Inst, Ft Worth, TX 76118 USA
Funding
U.S. National Science Foundation
Keywords
Input constraints; neural networks; optimal control; reinforcement learning; unknown dynamics; continuous-time
DOI
10.1109/TNNLS.2013.2276571
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an online policy iteration (PI) algorithm to learn the continuous-time optimal control solution for unknown constrained-input systems. The proposed PI algorithm is implemented on an actor-critic structure where two neural networks (NNs) are tuned online and simultaneously to generate the optimal bounded control policy. The requirement of complete knowledge of the system dynamics is obviated by employing a novel NN identifier in conjunction with the actor and critic NNs. It is shown how the identifier weights estimation error affects the convergence of the critic NN. A novel learning rule is developed to guarantee that the identifier weights converge to small neighborhoods of their ideal values exponentially fast. To provide an easy-to-check persistence of excitation condition, the experience replay technique is used. That is, recorded past experiences are used simultaneously with current data for the adaptation of the identifier weights. Stability of the whole system consisting of the actor, critic, system state, and system identifier is guaranteed while all three networks undergo adaptation. Convergence to a near-optimal control law is also shown. The effectiveness of the proposed method is illustrated with a simulation example.
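The experience replay idea in the abstract (recorded past data used together with current data to adapt the identifier weights) can be illustrated with a minimal sketch. This is not the paper's learning rule, which this record does not reproduce. It assumes a generic linear-in-parameters identifier y = Wᵀφ, and the regressor values, step size, and the names `replay_rank_ok` and `replay_update` are all hypothetical. The rank test stands in for the kind of easy-to-check richness condition that replaces a classical persistence of excitation requirement.

```python
import numpy as np

def replay_rank_ok(regressors):
    """Richness check (assumed form): recorded regressors span the feature space."""
    Phi = np.asarray(regressors, dtype=float)          # shape (m, p)
    return bool(np.linalg.matrix_rank(Phi) == Phi.shape[1])

def replay_update(W_hat, phi, y, memory, lr=0.1):
    """One gradient step on the current sample plus every recorded sample."""
    grad = np.outer(phi, y - W_hat.T @ phi)            # current-data term
    for phi_j, y_j in memory:                          # replayed-data terms
        grad += np.outer(phi_j, y_j - W_hat.T @ phi_j)
    return W_hat + lr * grad

# Hypothetical "ideal" identifier weights: 3 features, 2 outputs.
W_true = np.array([[1.0, -2.0], [0.5, 0.3], [-1.5, 0.8]])

# Recorded past experiences (illustrative values chosen to have full rank).
recorded_phi = [np.array(v, dtype=float)
                for v in ([1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1])]
memory = [(p, W_true.T @ p) for p in recorded_phi]

# Current data that is NOT exciting on its own: the regressor never changes.
phi_now = np.array([1.0, 0.0, 0.0])
y_now = W_true.T @ phi_now

rank_with_memory = replay_rank_ok(recorded_phi)
rank_current_only = replay_rank_ok([phi_now])

W_replay = np.zeros_like(W_true)
W_plain = np.zeros_like(W_true)
for _ in range(500):
    W_replay = replay_update(W_replay, phi_now, y_now, memory)
    W_plain = replay_update(W_plain, phi_now, y_now, memory=[])

err_replay = np.linalg.norm(W_replay - W_true)
err_plain = np.linalg.norm(W_plain - W_true)
print(rank_with_memory, rank_current_only)
print(err_replay, err_plain)
```

In this toy setting the weight error shrinks only when the recorded samples are replayed. With the current regressor alone, the weights along unexcited feature directions never move.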
Pages: 1513-1525
Page count: 13
Related papers
50 in total
  • [31] Adaptive Control of Uncertain Nonaffine Nonlinear Systems With Input Saturation Using Neural Networks
    Esfandiari, Kasra
    Abdollahi, Farzaneh
    Talebi, Heidar Ali
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (10) : 2311 - 2322
  • [32] Event-trigger-based robust control for nonlinear constrained-input systems using reinforcement learning method
    Yang, Dongsheng
    Li, Ting
    Zhang, Huaguang
    Xie, Xiangpeng
    NEUROCOMPUTING, 2019, 340 : 158 - 170
  • [33] Robust Optimal Control Scheme for Unknown Constrained-Input Nonlinear Systems via a Plug-n-Play Event-Sampled Critic-Only Algorithm
    Zhang, Huaguang
    Zhang, Kun
    Xiao, Geyang
    Jiang, He
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2020, 50 (09) : 3169 - 3180
  • [34] Adaptive nonlinear control using input normalized neural networks
    Henzeh Leeghim
    In-Ho Seo
    Hyochoong Bang
    Journal of Mechanical Science and Technology, 2008, 22
  • [35] Adaptive Optimal Control for Large-Scale Systems based on Robust Policy Iteration
    Zhao, Fuyu
    Zhao, Liang
    2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022, : 2704 - 2709
  • [36] Adaptive neural control of unknown non-affine nonlinear systems with input deadzone and unknown disturbance
    Shuang Zhang
    Linghuan Kong
    Suwen Qi
    Peng Jing
    Wei He
    Bin Xu
    Nonlinear Dynamics, 2019, 95 : 1283 - 1299
  • [37] Adaptive nonlinear control using input normalized neural networks
    Leeghim, Henzeh
    Seo, In-Ho
    Bang, Hyochoong
    JOURNAL OF MECHANICAL SCIENCE AND TECHNOLOGY, 2008, 22 (06) : 1073 - 1083
  • [38] Adaptive neural control of unknown non-affine nonlinear systems with input deadzone and unknown disturbance
    Zhang, Shuang
    Kong, Linghuan
    Qi, Suwen
    Jing, Peng
    He, Wei
    Xu, Bin
    NONLINEAR DYNAMICS, 2019, 95 (02) : 1283 - 1299
  • [39] Learning-based Optimal Control of Constrained Switched Linear Systems using Neural Networks
    Markolf, Lukas
    Stursberg, Olaf
    PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS (ICINCO), 2021, : 90 - 98
  • [40] Optimal control of policy iteration with adaptive adjustment of window length
    Fang X.
    Luan X.-L.
    Liu F.
    Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2024, 41 (04) : 745 - 750