On Distributed Model-Free Reinforcement Learning Control With Stability Guarantee

Cited by: 2

Authors:

Mukherjee, Sayak [1]

Vu, Thanh Long [1]

Affiliation:

[1] Pacific Northwest National Laboratory, Optimization & Control Group, Richland, WA 99354 USA

Source:

IEEE CONTROL SYSTEMS LETTERS | 2021 / Vol. 5 / Issue 05

Keywords:

Feedback control; Power system stability; Eigenvalues and eigenfunctions; Decision making; Computational modeling; Mathematical model; Dynamical systems; Distributed control; learning control; reinforcement learning; stability guarantee; interconnected systems; TIME LINEAR-SYSTEMS; DESIGN

DOI:

10.1109/LCSYS.2020.3041218

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Distributed learning can enable scalable and effective decision making in complex cyber-physical systems such as smart transportation, robotic swarms, and power systems. However, most existing learning paradigms do not guarantee system stability, and this limitation can hinder the wide deployment of machine learning in decision making for safety-critical systems. This letter presents a stability-guaranteed distributed reinforcement learning (SGDRL) framework for interconnected linear subsystems that does not require knowledge of the subsystem models. While the learning process requires data from a peer-to-peer (p2p) communication architecture, the control implementation of each subsystem is based only on its local states. The stability of the interconnected subsystems is ensured by a diagonally dominant eigenvalue condition, which is then used in a model-free RL algorithm to learn the stabilizing control gains. The RL algorithm follows an off-policy iterative framework with interleaved policy evaluation and policy update steps. We numerically validate our theoretical results through simulations on four interconnected subsystems.


Pages: 1615-1620

Page count: 6

