Two-timescale stochastic gradient descent in continuous time with applications to joint online parameter estimation and optimal sensor placement

Cited by: 3
Authors
Sharrock, Louis [1]
Kantas, Nikolas [1]
Affiliations
[1] Imperial Coll London, Dept Math, London SW7 2AZ, England
Funding
Engineering and Physical Sciences Research Council (UK)
Keywords
Two-timescale stochastic approximation; stochastic gradient descent; recursive maximum likelihood; online parameter estimation; optimal sensor placement; Beneš filter; Kalman-Bucy filter; MAXIMUM-LIKELIHOOD-ESTIMATION; CONVERGENCE RATE; DIFFUSION-APPROXIMATION; ASYMPTOTIC PROPERTIES; POISSON EQUATION; STABILITY; IDENTIFICATION; ALGORITHMS; PROJECTION
DOI
10.3150/22-BEJ1493
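Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract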
In this paper, we establish the almost sure convergence of two-timescale stochastic gradient descent algorithms in continuous time under general noise and stability conditions, extending well-known results in discrete time. We analyse algorithms with additive noise and those with non-additive noise. In the non-additive case, our analysis is carried out under the assumption that the noise is a continuous-time Markov process, controlled by the algorithm states. The algorithms we consider can be applied to a broad class of bilevel optimisation problems. We study one such problem in detail, namely, the problem of joint online parameter estimation and optimal sensor placement for a partially observed diffusion process. We demonstrate how this can be formulated as a bilevel optimisation problem, and propose a solution in the form of a continuous-time, two-timescale, stochastic gradient descent algorithm. Furthermore, under suitable conditions on the latent signal, the filter, and the filter derivatives, we establish almost sure convergence of the online parameter estimates and optimal sensor placements to the stationary points of the asymptotic log-likelihood and asymptotic filter covariance, respectively. We also provide numerical examples, illustrating the application of the proposed methodology to a partially observed Beneš equation, and a partially observed stochastic advection-diffusion equation.
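As a rough illustration of the kind of scheme the abstract describes, the sketch below simulates a two-timescale stochastic gradient descent with additive noise on a toy bilevel problem, using an Euler-Maruyama discretisation. Everything specific here is an assumption made for illustration and is not taken from the paper: the quadratic objectives `f` and `g`, the step-size exponents 0.6 and 0.9, the noise level `sigma`, and the function name `two_timescale_sgd`. The step sizes are chosen so that the slow step size becomes negligible relative to the fast one as time grows.

```python
import math
import random


def two_timescale_sgd(t_end=200.0, dt=0.01, sigma=0.5, seed=0):
    """Euler-Maruyama simulation of a toy two-timescale SGD (illustrative only).

    Fast variable x descends f(x, y) = 0.5 * (x - y)**2 in x.
    Slow variable y descends g(x, y) = 0.5 * (x + y - 4)**2 in y.
    Both gradients vanish at the joint stationary point x = y = 2.
    """
    rng = random.Random(seed)
    x, y = -1.0, 5.0  # arbitrary starting point (assumption)
    n_steps = int(round(t_end / dt))
    sqrt_dt = math.sqrt(dt)

    for k in range(n_steps):
        t = k * dt
        # Decaying step sizes; gamma_slow / gamma_fast -> 0 as t grows.
        gamma_fast = (1.0 + t) ** -0.6
        gamma_slow = (1.0 + t) ** -0.9

        # Partial gradients of the toy objectives.
        grad_x = x - y
        grad_y = x + y - 4.0

        # Independent Brownian increments model the additive noise.
        dw_x = sqrt_dt * rng.gauss(0.0, 1.0)
        dw_y = sqrt_dt * rng.gauss(0.0, 1.0)

        # Simultaneous update of both variables from the current state.
        x_new = x - gamma_fast * (grad_x * dt + sigma * dw_x)
        y_new = y - gamma_slow * (grad_y * dt + sigma * dw_y)
        x, y = x_new, y_new

    return x, y


if __name__ == "__main__":
    x_final, y_final = two_timescale_sgd()
    print(f"x = {x_final:.3f}, y = {y_final:.3f}")
```

In this toy setting both iterates should settle near 2, with the fast variable tracking the slow one. The non-additive (Markov noise) case and the filtering application treated in the paper are not covered by this sketch.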
Pages: 1137-1165
Page count: 29