Reinforcement Learning of Structured Stabilizing Control for Linear Systems With Unknown State Matrix

Cited by: 5
Authors
Mukherjee, Sayak [1 ]
Vu, Thanh Long [1 ]
Affiliations
[1] Pacific Northwest Natl Lab PNNL, Optimizat & Control Grp, Richland, WA 99354 USA
Keywords
Heuristic algorithms; Computational modeling; Optimal control; Feedback control; Dynamical systems; Adaptation models; Trajectory; Distributed control; linear quadratic regulator (LQR); reinforcement learning (RL); stability guarantee; structured learning; DISTRIBUTED CONTROL;
DOI
10.1109/TAC.2022.3155384
CLC Classification Number
TP [Automation technology; computer technology];
Discipline Classification Code
0812 ;
Abstract
This article delves into designing stabilizing feedback control gains for continuous-time linear systems with an unknown state matrix, in which the control gain is subject to a structural constraint. We bring ideas from reinforcement learning (RL) together with sufficient stability and performance guarantees to design these structured gains using trajectory measurements of states and controls. We first formulate a model-based linear quadratic regulator (LQR) framework to compute the structured control gain. Subsequently, we transform this model-based LQR formulation into a data-driven RL algorithm, removing the need to know the system state matrix. Theoretical guarantees are provided for the stability of the closed-loop system and the convergence of the structured RL (SRL) algorithm. A notable application of the proposed SRL framework is the design of distributed static feedback control, which is necessary for the automatic control of many large-scale cyber-physical systems. Accordingly, we validate our theoretical results with numerical simulations on a multiagent networked linear time-invariant dynamical system.
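The model-based LQR step the abstract describes can be illustrated, in its simplest form, by Kleinman-type policy iteration on a scalar system; this is a minimal sketch of that general idea, not the paper's structured multivariable SRL algorithm, and the function name and parameter values here are illustrative assumptions.

```python
import math

# Illustrative scalar sketch (assumed setup, not the paper's algorithm):
# LQR for dx/dt = a*x + b*u with cost J = integral of q*x^2 + r*u^2 dt.
# Each iteration solves a scalar Lyapunov equation for the current gain,
# then performs a policy-improvement update (Kleinman-style iteration).

def lqr_policy_iteration(a, b, q, r, k0, tol=1e-10, max_iter=50):
    """Returns the LQR gain; k0 must be stabilizing (a - b*k0 < 0)."""
    k = k0
    for _ in range(max_iter):
        acl = a - b * k                      # closed-loop pole
        p = -(q + r * k * k) / (2.0 * acl)   # Lyapunov: 2*acl*p + q + r*k^2 = 0
        k_new = b * p / r                    # policy improvement
        if abs(k_new - k) < tol:
            return k_new
        k = k_new
    return k

# For a = b = q = r = 1, the Riccati equation 2p - p^2 + 1 = 0 gives
# p = 1 + sqrt(2), and the optimal gain is k* = b*p/r = 1 + sqrt(2).
k_opt = lqr_policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
print(abs(k_opt - (1.0 + math.sqrt(2.0))) < 1e-8)  # True
```

The paper's data-driven variant replaces the model-based Lyapunov solve with estimates formed from trajectory measurements, and additionally enforces the structural constraint on the gain at each update.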
Pages: 1746-1752
Page count: 7