Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems

Cited by: 4
Authors
Luo, Yuwei [1]
Gupta, Varun [2]
Kolar, Mladen [2]
Affiliations
[1] Stanford Univ, 655 Knight Way, Stanford, CA 94305 USA
[2] Univ Chicago, 5807 S Woodlawn Ave, Chicago, IL 60637 USA
Keywords
Linear Quadratic Regulator; dynamic regret; non-stationary learning; ordinary least squares estimator; TIME
DOI
10.1145/3508029
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We consider the problem of controlling a Linear Quadratic Regulator (LQR) system over a finite horizon $T$ with fixed and known cost matrices $Q$, $R$, but unknown and non-stationary dynamics $\{A_t, B_t\}$. The sequence of dynamics matrices can be arbitrary, but with a total variation, $V_T$, assumed to be $o(T)$ and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially sub-optimal controllers is available for all $t$, we present an algorithm that achieves the optimal dynamic regret of $\tilde{O}(V_T^{2/5} T^{3/5})$. With piecewise constant dynamics, our algorithm achieves the optimal regret of $\tilde{O}(\sqrt{ST})$, where $S$ is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection strategy, which builds on an approach recently developed for contextual multi-armed bandit problems. We also argue that non-adaptive forgetting (e.g., restarting or using sliding window learning with a static window size) may not be regret optimal for the LQR problem, even when the window size is optimally tuned with the knowledge of $V_T$. The main technical challenge in the analysis of our algorithm is to prove that the ordinary least squares (OLS) estimator has a small bias when the parameter to be estimated is non-stationary. Our analysis also highlights that the key motif driving the regret is that the LQR problem is in spirit a bandit problem with linear feedback and locally quadratic cost. This motif is more universal than the LQR problem itself, and therefore we believe our results should find wider application.
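As a rough illustration of the estimation problem the abstract refers to, the sketch below fits the dynamics $[A\ B]$ by ordinary least squares from one trajectory, first with fixed dynamics and then with a single switch halfway through. It is not the paper's algorithm. The function names (`ols_dynamics`, `simulate`), the example matrices, the noise level, and the use of purely random inputs instead of a stabilizing controller are all assumptions made for illustration.

```python
import numpy as np

def ols_dynamics(xs, us):
    """OLS estimate of Theta = [A B] from the model x_{t+1} = A x_t + B u_t + noise."""
    Z = np.hstack([xs[:-1], us])   # regressors (x_t, u_t), shape (T, n + m)
    Y = xs[1:]                     # targets x_{t+1}, shape (T, n)
    theta_T, *_ = np.linalg.lstsq(Z, Y, rcond=None)  # solves Z @ Theta.T ~ Y
    return theta_T.T               # shape (n, n + m)

def simulate(thetas, n, m, rng, noise=0.1):
    """Roll out the system under random inputs (hypothetical exploration signal)."""
    T = len(thetas)
    xs = np.zeros((T + 1, n))
    us = rng.normal(size=(T, m))
    for t, theta in enumerate(thetas):
        A, B = theta[:, :n], theta[:, n:]
        xs[t + 1] = A @ xs[t] + B @ us[t] + noise * rng.normal(size=n)
    return xs, us

rng = np.random.default_rng(0)
n, m, T = 2, 1, 2000
# Two illustrative, stable parameter matrices [A B] (made-up values).
theta_a = np.array([[0.5, 0.1, 1.0], [0.0, 0.4, 0.5]])
theta_b = np.array([[0.2, -0.3, 0.5], [0.1, 0.6, 1.0]])

# Stationary dynamics: OLS over the whole trajectory recovers theta_a closely.
xs, us = simulate([theta_a] * T, n, m, rng)
err_stationary = np.linalg.norm(ols_dynamics(xs, us) - theta_a)

# One switch at T/2: pooled OLS blends both regimes and misses the current one.
xs, us = simulate([theta_a] * (T // 2) + [theta_b] * (T // 2), n, m, rng)
err_switch = np.linalg.norm(ols_dynamics(xs, us) - theta_b)

print(f"stationary error: {err_stationary:.4f}, error after a switch: {err_switch:.4f}")
```

In this toy setting the pooled estimate after a switch is far from the current parameters, which is the kind of effect that motivates forgetting or change detection. How much data to discard, and when, is what the abstract's adaptive detection strategy addresses.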
Pages: 72