Online Learning-based Optimal Control of Nonlinear Systems with Finite-Time Convergence Guarantees

Cited by: 0

Authors:

Kokolakis, Nick-Marios T. [1]

Vamvoudakis, Kyriakos G. [1]

Affiliations:

[1] Georgia Inst Technol, Daniel Guggenheim Sch Aerosp Engn, Atlanta, GA 30332 USA

Source:

2022 American Control Conference (ACC) | 2022

Keywords:

Adaptive learning; finite-time stability; optimal control; reinforcement learning; autonomy; REINFORCEMENT; STABILIZATION; DESIGN;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper develops a critic-only reinforcement learning-based algorithm for learning the solution to the Hamilton-Jacobi-Bellman equation in finite time. In particular, a non-Lipschitz experience replay-based learning law utilizing recorded and current data is introduced for updating the critic weights to learn the value function. The non-Lipschitz property of the dynamics gives rise to finite-time convergence and stability, while the experience replay-based approach eliminates the need to satisfy the persistence of excitation condition if the recorded data is sufficiently rich. Simulation results demonstrate the efficacy of the proposed approach.
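The finite-time mechanism described in the abstract can be illustrated with a toy example. The sketch below is not the authors' learning law. It assumes a two-weight linear-in-parameters model, a hypothetical stack of three recorded regressors (`PHI`) that spans the weight space, and a generic update of the form w_dot = -gain * g / ||g||^(1 - alpha), where g is the least-squares gradient over the recorded stack. With alpha = 1 this is the ordinary Lipschitz gradient flow, which converges only exponentially; with 0 < alpha < 1 the right-hand side is non-Lipschitz at g = 0 and the continuous-time flow reaches the ideal weights in finite time. All names and numeric values are assumptions chosen for illustration.

```python
import math

# Hypothetical recorded regressors phi_i; they span R^2, so the stack is "rich".
PHI = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
W_TRUE = (1.0, -2.0)  # hypothetical ideal weights, used only to generate targets
TARGETS = [p[0] * W_TRUE[0] + p[1] * W_TRUE[1] for p in PHI]


def replay_gradient(w):
    """Gradient of 0.5 * sum_i (w . phi_i - y_i)^2 over the recorded stack."""
    g0 = g1 = 0.0
    for (p0, p1), y in zip(PHI, TARGETS):
        err = w[0] * p0 + w[1] * p1 - y
        g0 += p0 * err
        g1 += p1 * err
    return g0, g1


def simulate(alpha, gain=1.0, dt=1e-3, t_end=5.0):
    """Forward-Euler integration of w_dot = -gain * g / ||g||^(1 - alpha).

    alpha = 1 gives the ordinary (Lipschitz) gradient flow;
    0 < alpha < 1 gives a flow that is non-Lipschitz at g = 0.
    Returns the norm of the final weight error.
    """
    w = [0.0, 0.0]
    for _ in range(int(round(t_end / dt))):
        g0, g1 = replay_gradient(w)
        norm = math.hypot(g0, g1)
        if norm == 0.0:  # already at the minimizer; avoid dividing by zero
            break
        scale = gain * dt / norm ** (1.0 - alpha)
        w[0] -= scale * g0
        w[1] -= scale * g1
    return math.hypot(w[0] - W_TRUE[0], w[1] - W_TRUE[1])


err_lipschitz = simulate(alpha=1.0)
err_non_lipschitz = simulate(alpha=0.5)
print(f"Lipschitz law, weight error at t = 5:     {err_lipschitz:.2e}")
print(f"non-Lipschitz law, weight error at t = 5: {err_non_lipschitz:.2e}")
```

Because the regressor stack is fixed and already spans the weight space, no ongoing excitation of the input is needed in this toy setting, which mirrors the role the abstract assigns to sufficiently rich recorded data. The forward-Euler discretization leaves a small residual chatter around the ideal weights under the non-Lipschitz law, so the printed error is small but not exactly zero.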

Pages: 812-817

Number of pages: 6
