A reinforcement learning method with closed-loop stability guarantee

Cited: 5
Authors
Osinenko, Pavel [1 ]
Beckenbach, Lukas [1 ]
Goehrt, Thomas [1 ]
Streif, Stefan [1 ]
Affiliations
[1] Tech Univ Chemnitz, Automat Control & Syst Dynam Lab, Chemnitz, Germany
Source
IFAC PAPERSONLINE, 2020, Vol. 53, Issue 2
Keywords
Reinforcement learning control; Stability of nonlinear systems; Lyapunov methods; ITERATION;
DOI
10.1016/j.ifacol.2020.12.2237
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
Reinforcement learning (RL) in the context of control systems offers wide possibilities for controller adaptation. Given an infinite-horizon cost function, the so-called critic of RL approximates it with a neural network and sends this information to the controller (called the "actor"). However, the issue of closed-loop stability under an RL method is still not fully addressed. Since the critic delivers merely an approximation to the value function of the corresponding infinite-horizon problem, no guarantee can be given in general as to whether the actor's actions stabilize the system. Different approaches to this issue exist. The current work offers a particular one which, starting from a (not necessarily smooth) control Lyapunov function (CLF), derives an online RL scheme such that a practical semi-global stability property of the closed loop can be established. The approach logically continues the authors' work on parameterized controllers and Lyapunov-like constraints for RL, whereas the CLF now appears merely in one of the constraints of the control scheme. The analysis of the closed-loop behavior is done in a sample-and-hold (SH) manner, thus offering a certain insight into the digital realization. A case study with a non-holonomic integrator shows the capability of the derived method to optimize the given cost function compared to a nominal stabilizing controller. Copyright (C) 2020 The Authors.
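The abstract describes an RL scheme in which the CLF enters as one constraint on the action choice, analyzed in sample-and-hold fashion. The following is a minimal sketch of that constraint structure only, under stated assumptions that differ from the paper: a 2D single-integrator plant with a smooth quadratic CLF (instead of the paper's nonholonomic integrator and nonsmooth CLF), a hand-coded stand-in for the critic, and a finite action grid. Names like `clf_constrained_action` are hypothetical, not from the paper.

```python
import numpy as np

DT = 0.1  # sample-and-hold period

def step(x, u):
    """One sample-and-hold step of the assumed integrator plant x' = u."""
    return x + DT * u

def clf(x):
    """Assumed quadratic CLF V(x) = x^T x."""
    return float(x @ x)

def critic(x, u):
    """Stand-in critic: running cost plus CLF of the successor state
    as a crude value-function surrogate (not the paper's neural net)."""
    return float(x @ x + 0.1 * (u @ u) + clf(step(x, u)))

def clf_constrained_action(x, actions, decay=0.05):
    """Pick the critic-minimizing action among those that make the CLF
    decay by at least a fraction `decay` over one hold interval; fall
    back to a nominal stabilizer u = -x if no grid action qualifies."""
    v = clf(x)
    feasible = [u for u in actions if clf(step(x, u)) <= (1.0 - decay) * v]
    if not feasible:
        return -x  # nominal stabilizing controller
    return min(feasible, key=lambda u: critic(x, u))

# Closed-loop rollout from an arbitrary initial state.
grid = [np.array([a, b]) for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.0, 1.0)]
x = np.array([2.0, -1.5])
for _ in range(100):
    x = step(x, clf_constrained_action(x, grid))
print(clf(x))  # CLF value after 100 steps; decays toward zero
```

The fallback to the nominal controller is what keeps the loop stabilizing even when the critic-driven choice is infeasible, mirroring the role of the Lyapunov-like constraint in the abstract.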
Pages: 8043-8048
Page count: 6