We consider the problem of controlling a Linear Quadratic Regulator (LQR) system over a finite horizon $T$, with fixed and known cost matrices $Q, R$ but unknown and non-stationary dynamics $\{A_t, B_t\}$. The sequence of dynamics matrices can be arbitrary, but with total variation $V_T$ assumed to be $o(T)$ and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially suboptimal, controllers is available for all $t$, we present an algorithm that achieves the optimal dynamic regret of $\tilde{O}(V_T^{2/5} T^{3/5})$. With piecewise-constant dynamics, our algorithm achieves the optimal regret of $\tilde{O}(\sqrt{ST})$, where $S$ is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection strategy, which builds on an approach recently developed for contextual multi-armed bandit problems. We also argue that non-adaptive forgetting (e.g., restarting, or sliding-window learning with a static window size) may not be regret-optimal for the LQR problem, even when the window size is optimally tuned with knowledge of $V_T$. The main technical challenge in the analysis of our algorithm is to prove that the ordinary least squares (OLS) estimator has a small bias when the parameter being estimated is non-stationary. Our analysis also highlights that the key motif driving the regret is that the LQR problem is, in spirit, a bandit problem with linear feedback and locally quadratic cost. This motif is more universal than the LQR problem itself, and therefore we believe our results should find wider application.
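The abstract does not pin down how $V_T$ is measured; a common convention (an assumption here for illustration, not necessarily the paper's exact choice of norm) is the path length of the stacked parameter sequence $\Theta_t = [A_t \; B_t]$:

```latex
% Total variation (path length) of the dynamics; the Frobenius norm
% is an illustrative assumption, not taken from the abstract.
V_T \;=\; \sum_{t=2}^{T} \bigl\| \Theta_t - \Theta_{t-1} \bigr\|_F,
\qquad \Theta_t = \begin{bmatrix} A_t & B_t \end{bmatrix},
```

under which the stated dynamic regret bound reads $\tilde{O}\bigl(V_T^{2/5} T^{3/5}\bigr)$, recovering $\tilde{O}(\sqrt{ST})$ in the piecewise-constant case.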
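To make the estimation primitive concrete, the following is a minimal sketch (not the paper's algorithm; names and shapes are hypothetical) of OLS identification of $\Theta = [A \; B]$ from a single trajectory $x_{t+1} = A_t x_t + B_t u_t + w_t$. When the dynamics drift within the estimation window, the estimate is biased toward an average of the underlying $\{A_t, B_t\}$, which is exactly the bias the analysis must control:

```python
import numpy as np

def ols_system_id(xs, us):
    """OLS estimate of Theta = [A B] from x_{t+1} ~ A x_t + B u_t + w_t.

    xs: (T+1, n) array of states; us: (T, m) array of inputs.
    Returns (A_hat, B_hat). Under drifting dynamics, the estimate
    is biased toward an average of the time-varying parameters.
    """
    n = xs.shape[1]
    Z = np.hstack([xs[:-1], us])   # regressors z_t = (x_t, u_t), shape (T, n+m)
    Y = xs[1:]                     # targets x_{t+1}, shape (T, n)
    # Solve min_Theta ||Y - Z Theta^T||_F^2 via least squares.
    Theta_hat, *_ = np.linalg.lstsq(Z, Y, rcond=None)
    Theta_hat = Theta_hat.T        # shape (n, n+m)
    return Theta_hat[:, :n], Theta_hat[:, n:]

# Illustrative usage on synthetic stationary data.
rng = np.random.default_rng(0)
n, m, T = 3, 2, 500
A, B = 0.9 * np.eye(n), rng.standard_normal((n, m))
xs, us = np.zeros((T + 1, n)), rng.standard_normal((T, m))
for t in range(T):
    xs[t + 1] = A @ xs[t] + B @ us[t] + 0.1 * rng.standard_normal(n)
A_hat, B_hat = ols_system_id(xs, us)
```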
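The abstract's adaptive non-stationarity detection strategy is not specified here; purely as a hedged caricature of that family of detectors (threshold, window size, and function names below are placeholders, not the paper's test), one can restart estimation whenever recent one-step prediction residuals become inconsistent with the current OLS estimate:

```python
def drift_detected(xs, us, A_hat, B_hat, window=50, thresh=1.0):
    """Flag a restart if the mean squared one-step residual over the
    most recent `window` steps exceeds `thresh` (placeholder values)."""
    preds = xs[-window - 1:-1] @ A_hat.T + us[-window:] @ B_hat.T
    resid = xs[-window:] - preds
    return float(np.mean(resid ** 2)) > thresh
```

A detector of this kind forgets only when the data demands it, in contrast to the static restart or sliding-window schemes the abstract argues can be regret-suboptimal even with $V_T$ known.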