A reinforcement learning method with closed-loop stability guarantee

Cited: 5
Authors
Osinenko, Pavel [1 ]
Beckenbach, Lukas [1 ]
Goehrt, Thomas [1 ]
Streif, Stefan [1 ]
Affiliations
[1] Tech Univ Chemnitz, Automat Control & Syst Dynam Lab, Chemnitz, Germany
Source
IFAC PAPERSONLINE, 2020, Vol. 53, Issue 2
Keywords
Reinforcement learning control; Stability of nonlinear systems; Lyapunov methods; ITERATION;
DOI
10.1016/j.ifacol.2020.12.2237
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
Reinforcement learning (RL) in the context of control systems offers wide possibilities for controller adaptation. Given an infinite-horizon cost function, the so-called critic of RL approximates it with a neural network and sends this information to the controller (called the "actor"). However, the issue of closed-loop stability under an RL method is still not fully addressed. Since the critic delivers merely an approximation to the value function of the corresponding infinite-horizon problem, no guarantee can be given in general as to whether the actor's actions stabilize the system. Different approaches to this issue exist. The current work offers a particular one which, starting from a (not necessarily smooth) control Lyapunov function (CLF), derives an online RL scheme such that a practical semi-global stability property of the closed loop can be established. The approach logically continues the authors' work on parameterized controllers and Lyapunov-like constraints for RL, whereas the CLF now appears merely in one of the constraints of the control scheme. The analysis of the closed-loop behavior is done in a sample-and-hold (SH) manner, thus offering a certain insight into the digital realization. A case study with a non-holonomic integrator shows the capability of the derived method to optimize the given cost function compared to a nominal stabilizing controller. Copyright (C) 2020 The Authors.
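The abstract describes an RL scheme in which the CLF enters as one constraint on the action choice, analyzed in sample-and-hold fashion. The following is a minimal sketch of that constraint structure only, under stated assumptions that differ from the paper: a 2D single-integrator plant with a smooth quadratic CLF (instead of the paper's nonholonomic integrator and nonsmooth CLF), a hand-coded stand-in for the critic, and a finite action grid. Names like `clf_constrained_action` are hypothetical, not from the paper.

```python
import numpy as np

DT = 0.1  # sample-and-hold period

def step(x, u):
    """One sample-and-hold step of the assumed integrator plant x' = u."""
    return x + DT * u

def clf(x):
    """Assumed quadratic CLF V(x) = x^T x."""
    return float(x @ x)

def critic(x, u):
    """Stand-in critic: running cost plus CLF of the successor state
    as a crude value-function surrogate (not the paper's neural net)."""
    return float(x @ x + 0.1 * (u @ u) + clf(step(x, u)))

def clf_constrained_action(x, actions, decay=0.05):
    """Pick the critic-minimizing action among those that make the CLF
    decay by at least a fraction `decay` over one hold interval; fall
    back to a nominal stabilizer u = -x if no grid action qualifies."""
    v = clf(x)
    feasible = [u for u in actions if clf(step(x, u)) <= (1.0 - decay) * v]
    if not feasible:
        return -x  # nominal stabilizing controller
    return min(feasible, key=lambda u: critic(x, u))

# Closed-loop rollout from an arbitrary initial state.
grid = [np.array([a, b]) for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.0, 1.0)]
x = np.array([2.0, -1.5])
for _ in range(100):
    x = step(x, clf_constrained_action(x, grid))
print(clf(x))  # CLF value after 100 steps; decays toward zero
```

The fallback to the nominal controller is what keeps the loop stabilizing even when the critic-driven choice is infeasible, mirroring the role of the Lyapunov-like constraint in the abstract.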
Pages: 8043-8048
Page count: 6