Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems

被引:4
|
作者
Luo, Yuwei [1 ]
Gupta, Varun [2 ]
Kolar, Mladen [2 ]
机构
[1] Stanford Univ, 655 Knight Way, Stanford, CA 94305 USA
[2] Univ Chicago, 5807 S Woodlawn Ave, Chicago, IL 60637 USA
关键词
Linear Quadratic Regulator; dynamic regret; non-stationary learning; ordinary least squares estimator; TIME;
D O I
10.1145/3508029
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
We consider the problem of controlling a Linear Quadratic Regulator (LQR) system over a finite horizon T with fixed and known cost matrices Q, R, but unknown and non-stationary dynamics {A(t), B-t}. The sequence of dynamics matrices can be arbitrary, but with a total variation, V-T, assumed to be o(T) and unknown to the controller. Under the assumption that a sequence of stabilizing, but potentially sub-optimal controllers is available for all t, we present an algorithm that achieves the optimal dynamic regret of (O) over tilde ((VTT3/5)-T-2/5). With piecewise constant dynamics, our algorithm achieves the optimal regret of (O) over tilde(root ST) where S is the number of switches. The crux of our algorithm is an adaptive non-stationarity detection strategy, which builds on an approach recently developed for contextual Multi-armed Bandit problems. We also argue that non-adaptive forgetting (e.g., restarting or using sliding window learning with a static window size) may not be regret optimal for the LQR problem, even when the window size is optimally tuned with the knowledge of V-T. The main technical challenge in the analysis of our algorithm is to prove that the ordinary least squares (OLS) estimator has a small bias when the parameter to be estimated is non-stationary. Our analysis also highlights that the key motif driving the regret is that the LQR problem is in spirit a bandit problem with linear feedback and locally quadratic cost. This motif is more universal than the LQR problem itself, and therefore we believe our results should find wider application.
引用
收藏
页数:72
相关论文
共 50 条
  • [31] Artificial Neural Network Model for Analysis of Linear Dynamic Systems Subject to Non-stationary Excitations
    Posayanant, Pawarid
    Chantarawichit, Patiphan
    Dy, Damang
    Sompornjaroensuk, Yos
    THAI JOURNAL OF MATHEMATICS, 2023, 21 (01): : 183 - 199
  • [32] SOME ANALYTICAL PROCEDURES OF NON-STATIONARY CONTROL SYSTEMS
    GIADROSS.G
    TORRIANO, D
    ELETTROTECNICA, 1969, 56 (8A): : 488 - &
  • [33] Identification and control of non-stationary time delayed systems
    Dréano, P
    Laurent, R
    ROBUST CONTROL DESIGN 2000, VOLS 1 & 2, 2000, 1-2 : 267 - 271
  • [34] Structural and parametric non-stationary modal control systems
    Fedosenkov, D. B.
    Simikova, A. A.
    Fedosenkov, B. A.
    XII ALL-RUSSIAN SCIENTIFIC AND PRACTICAL CONFERENCE (WITH INTERNATIONAL PARTICIPATION) ON AUTOMATION SYSTEMS IN EDUCATION, SCIENCE AND PRODUCTION, 2019, 2020, 865
  • [35] Non-stationary iterative solution of positive definite linear systems
    Santos, NR
    Evans, DJ
    INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 1999, 72 (03) : 289 - 304
  • [36] RESPONSE OF LINEAR VIBRATORY SYSTEMS TO NON-STATIONARY STOCHASTIC IMPULSES
    SRINIVASAN, SK
    SUBRAMANIAN, R
    KUMARASWAMY, S
    JOURNAL OF SOUND AND VIBRATION, 1967, 6 (02) : 169 - +
  • [37] Non-stationary parallel multisplitting algorithms for almost linear systems
    Arnal, J
    Migallón, V
    Penadés, J
    NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS, 1999, 6 (02) : 79 - 92
  • [38] Optimal Control of Non-stationary Differential Linear Repetitive Processes
    S. Dymkou
    M. Dymkov
    E. Rogers
    K. Galkowski
    Integral Equations and Operator Theory, 2008, 60 : 201 - 216
  • [39] Optimal control of non-stationary differential linear repetitive processes
    Dymkou, S.
    Dymkov, M.
    Rogers, E.
    Galkowski, K.
    INTEGRAL EQUATIONS AND OPERATOR THEORY, 2008, 60 (02) : 201 - 216
  • [40] Online Learning with Non-Convex Losses and Non-Stationary Regret
    Gao, Xiang
    Li, Xiaobo
    Zhang, Shuzhong
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84