Online Learning-based Optimal Control of Nonlinear Systems with Finite-Time Convergence Guarantees

Cited by: 0
Authors
Kokolakis, Nick-Marios T. [1]
Vamvoudakis, Kyriakos G. [1]
Affiliations
[1] Georgia Inst Technol, Daniel Guggenheim Sch Aerosp Engn, Atlanta, GA 30332 USA
Source
2022 AMERICAN CONTROL CONFERENCE, ACC | 2022
Keywords
Adaptive learning; finite-time stability; optimal control; reinforcement learning; autonomy; REINFORCEMENT; STABILIZATION; DESIGN;
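DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract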
This paper develops a critic-only reinforcement learning-based algorithm for learning the solution to the Hamilton-Jacobi-Bellman equation in finite time. In particular, a non-Lipschitz experience replay-based learning law utilizing recorded and current data is introduced for updating the critic weights to learn the value function. The non-Lipschitz property of the dynamics gives rise to finite-time convergence and stability, while the experience replay-based approach eliminates the need to satisfy the persistence of excitation condition if the recorded data is sufficiently rich. Simulation results demonstrate the efficacy of the proposed approach.
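The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's learning law. It assumes a critic that is linear in its weights with a hypothetical quadratic basis `phi`, a small stack of recorded samples, and a normalized-gradient update whose right-hand side is non-Lipschitz at zero error (exponent `gamma` in (0, 1)). In continuous time, an update of this form reaches the true weights in finite time when the recorded features span the weight space, which stands in here for the "sufficiently rich" data condition. The names `phi`, `finite_time_replay_fit`, `W_TRUE`, and `SAMPLES`, as well as all gains, are assumptions made for illustration.

```python
import math

def phi(x1, x2):
    # Quadratic basis in the state; an assumed choice for this sketch.
    return [x1 * x1, x1 * x2, x2 * x2]

def finite_time_replay_fit(samples, targets, gamma=0.5, alpha=2.0,
                           dt=1e-3, steps=20000):
    """Euler-integrate w' = -alpha * g / ||g||^(1 - gamma), where g is the
    gradient of the summed squared residuals over the recorded stack."""
    feats = [phi(*s) for s in samples]
    n = len(feats[0])
    w = [0.0] * n
    for _ in range(steps):
        # Accumulate the gradient over every recorded sample (the replay part).
        g = [0.0] * n
        for f, y in zip(feats, targets):
            e = sum(wi * fi for wi, fi in zip(w, f)) - y
            for i in range(n):
                g[i] += f[i] * e
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm < 1e-12:
            break  # residuals are numerically zero
        # Non-Lipschitz scaling: the effective gain grows as the error shrinks.
        scale = alpha / norm ** (1.0 - gamma)
        w = [wi - dt * scale * gi for wi, gi in zip(w, g)]
    return w

# Hypothetical ground-truth weights and recorded states (noise-free targets).
W_TRUE = [1.0, -2.0, 0.5]
SAMPLES = [(1, 0), (0, 1), (1, 1), (1, -1), (2, 1)]
TARGETS = [sum(wi * fi for wi, fi in zip(W_TRUE, phi(*s))) for s in SAMPLES]

w_hat = finite_time_replay_fit(SAMPLES, TARGETS)
print([round(v, 3) for v in w_hat])
```

Because the stack is fixed, no ongoing excitation of the input is needed in this sketch; the five recorded features already span the three-dimensional weight space. With a fixed Euler step the estimate chatters in a small neighborhood of the true weights rather than settling exactly, so the step size `dt` sets the achievable accuracy.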
Pages: 812-817
Page count: 6
Related papers
30 in total
  • [1] [Anonymous], 2013, REINFORCEMENT LEARNI, DOI 10.1007/S11916-013-0378-Z
  • [2] NONQUADRATIC COST AND NONLINEAR FEEDBACK-CONTROL
    BERNSTEIN, DS
    [J]. INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 1993, 3 (03) : 211 - 229
  • [3] Bertsekas D., 2019, COMPUTER SCI MATH
  • [4] Finite-time stability of continuous autonomous systems
    Bhat, SP
    Bernstein, DS
    [J]. SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2000, 38 (03) : 751 - 766
  • [5] Concurrent Learning for Convergence in Adaptive Control without Persistency of Excitation
    Chowdhary, Girish
    Johnson, Eric
    [J]. 49TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2010, : 3674 - 3679
  • [6] Finite-Time Stabilization and Optimal Feedback Control
    Haddad, Wassim M.
    L'Afflitto, Andrea
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (04) : 1069 - 1074
  • [7] Haddad WM., 2011, NONLINEAR DYNAMICAL
  • [8] UNIVERSAL APPROXIMATION OF AN UNKNOWN MAPPING AND ITS DERIVATIVES USING MULTILAYER FEEDFORWARD NETWORKS
    HORNIK, K
    STINCHCOMBE, M
    WHITE, H
    [J]. NEURAL NETWORKS, 1990, 3 (05) : 551 - 560
  • [9] Ioannou P., 2006, Adaptive control tutorial, V11
  • [10] Jiang, 2017, ROBUST ADAPTIVE DYNA