Online Gradient Descent for Linear Dynamical Systems

Cited by: 13

Authors:

Nonhoff, Marko [1]

Mueller, Matthias A. [1]

Affiliations:

[1] Leibniz Univ Hannover, Inst Automat Control, D-30167 Hannover, Germany

Source:

IFAC PAPERSONLINE | 2020 / Vol. 53 / Issue 02

Keywords:

Online convex optimization; linear systems; online learning; online gradient descent; predictive control; real-time optimal control; OPTIMIZATION;

DOI:

10.1016/j.ifacol.2020.12.1258

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

In this paper, online convex optimization is applied to the problem of controlling linear dynamical systems. An algorithm similar to online gradient descent, which can handle time-varying and unknown cost functions, is proposed. Then, performance guarantees are derived in terms of regret analysis. We show that the proposed control scheme achieves sublinear regret if the variation of the cost functions is sublinear. In addition, as a special case, the system converges to the optimal equilibrium if the cost functions are invariant after some finite time. Finally, the performance of the resulting closed loop is illustrated by numerical simulations. Copyright (C) 2020 The Authors.


Pages: 945-952

Page count: 8

References (24 in total):

[1]

Abbasi-Yadkori Y, 2014, PR MACH LEARN RES, V32

[2]

Agarwal N., 2019, PROC INT C MACH LEAR, P111

[3]

Akbari M., 2019, ARXIV191209451

[4]

[Anonymous], 2014, Convex Optimization

[5]

[Anonymous], 2003, PROC INT C MACHINE L

[6] Non-Stationary Stochastic Optimization [J].

Besbes, Omar ;

Gur, Yonatan ;

Zeevi, Assaf .

OPERATIONS RESEARCH, 2015, 63 (05) :1227-1244

[7] Online Convex Optimization With Time-Varying Constraints and Bandit Feedback [J].

Cao, Xuanyu ;

Liu, K. J. Ray .

IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2019, 64 (07) :2665-2680

[8]

Cohen A, 2018, PR MACH LEARN RES, V80

[9] Online Optimization as a Feedback Controller: Stability and Tracking [J].

Colombino, Marcello ;

Dall'Anese, Emiliano ;

Bernstein, Andrey .

IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2020, 7 (01) :422-432

[10]

Diehl M, 2005, IEE P-CONTR THEOR AP, V152, P296, DOI 10.1049/ip-cta:20040008
