Adaptive Optimal Control of Unknown Constrained-Input Systems Using Policy Iteration and Neural Networks

Cited by: 361
Authors
Modares, Hamidreza [1]
Lewis, Frank L. [2]
Naghibi-Sistani, Mohammad-Bagher [1]
Affiliations
[1] Ferdowsi Univ Mashhad, Dept Elect Engn, Mashhad, Iran
[2] Univ Texas Arlington, Res Inst, Ft Worth, TX 76118 USA
Funding
National Science Foundation (USA);
Keywords
Input constraints; neural networks; optimal control; reinforcement learning; unknown dynamics; CONTINUOUS-TIME;
DOI
10.1109/TNNLS.2013.2276571
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents an online policy iteration (PI) algorithm to learn the continuous-time optimal control solution for unknown constrained-input systems. The proposed PI algorithm is implemented on an actor-critic structure where two neural networks (NNs) are tuned online and simultaneously to generate the optimal bounded control policy. The requirement of complete knowledge of the system dynamics is obviated by employing a novel NN identifier in conjunction with the actor and critic NNs. It is shown how the identifier weights estimation error affects the convergence of the critic NN. A novel learning rule is developed to guarantee that the identifier weights converge to small neighborhoods of their ideal values exponentially fast. To provide an easy-to-check persistence of excitation condition, the experience replay technique is used. That is, recorded past experiences are used simultaneously with current data for the adaptation of the identifier weights. Stability of the whole system consisting of the actor, critic, system state, and system identifier is guaranteed while all three networks undergo adaptation. Convergence to a near-optimal control law is also shown. The effectiveness of the proposed method is illustrated with a simulation example.
Pages: 1513 - 1525
Number of pages: 13
Related papers
50 in total
  • [1] A policy iteration approach to online optimal control of continuous-time constrained-input systems
    Modares, Hamidreza
    Sistani, Mohammad-Bagher Naghibi
    Lewis, Frank L.
    ISA TRANSACTIONS, 2013, 52 (05) : 611 - 621
  • [2] Reinforcement learning-based optimal control of unknown constrained-input nonlinear systems using simulated experience
    Asl, Hamed Jabbari
    Uchibe, Eiji
    NONLINEAR DYNAMICS, 2023, 111 (17) : 16093 - 16110
  • [3] Reinforcement learning-based optimal control of unknown constrained-input nonlinear systems using simulated experience
    Hamed Jabbari Asl
    Eiji Uchibe
    Nonlinear Dynamics, 2023, 111 : 16093 - 16110
  • [4] Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems
    Modares, Hamidreza
    Lewis, Frank L.
    Naghibi-Sistani, Mohammad-Bagher
    AUTOMATICA, 2014, 50 (01) : 193 - 202
  • [5] Reinforcement Learning-Based Nearly Optimal Control for Constrained-Input Partially Unknown Systems Using Differentiator
    Guo, Xinxin
    Yan, Weisheng
    Cui, Rongxin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (11) : 4713 - 4725
  • [6] Data-based robust adaptive control for a class of unknown nonlinear constrained-input systems via integral reinforcement learning
    Yang, Xiong
    Liu, Derong
    Luo, Biao
    Li, Chao
    INFORMATION SCIENCES, 2016, 369 : 731 - 747
  • [7] Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning
    Modares, Hamidreza
    Lewis, Frank L.
    AUTOMATICA, 2014, 50 (07) : 1780 - 1792
  • [8] Adaptive Dynamic Programming for H∞ Control of Constrained-Input Nonlinear Systems
    Yang Xiong
    Liu Derong
    Wei Qinglai
    Wang Ding
    2015 34TH CHINESE CONTROL CONFERENCE (CCC), 2015, : 3027 - 3032
  • [9] An off-policy iteration algorithm for robust stabilization of constrained-input uncertain nonlinear systems
    Yang, Xiong
    Wei, Qinglai
    INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2018, 28 (18) : 5747 - 5765
  • [10] Optimal Leader-Follower Consensus for Constrained-Input Multiagent Systems With Completely Unknown Dynamics
    Shi, Jing
    Yue, Dong
    Xie, Xiangpeng
    IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2022, 52 (02) : 1182 - 1191