THE ZIG-ZAG PROCESS AND SUPER-EFFICIENT SAMPLING FOR BAYESIAN ANALYSIS OF BIG DATA

Cited by: 112
Authors
Bierkens, Joris [1, 2]
Fearnhead, Paul [3]
Roberts, Gareth [4]
Affiliations
[1] Delft Univ Technol, Delft, Netherlands
[2] Delft Inst Appl Math, Mourik Broekmanweg 6, NL-2628 XE Delft, Netherlands
[3] Univ Lancaster, Fylde Coll, Dept Math & Stat, Lancaster LA1 4YF, England
[4] Univ Warwick, Dept Stat, Coventry CV4 7AL, W Midlands, England
Source
ANNALS OF STATISTICS | 2019 / Vol. 47 / No. 3
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
MCMC; nonreversible Markov process; piecewise deterministic Markov process; stochastic gradient Langevin dynamics; sub-sampling; exact sampling; LONG-TIME BEHAVIOR; VARIANCE REDUCTION; SIMULATION;
DOI
10.1214/18-AOS1715
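Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code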
020208 ; 070103 ; 0714 ;
Abstract
Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multidimensional version of the Zig-Zag process of [Ann. Appl. Probab. 27 (2017) 846-882], a continuous-time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence) the Zig-Zag process offers a flexible nonreversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, that is, the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial preprocessing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.
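As a rough illustration of the simulation "without discretisation error" mentioned in the abstract, the following minimal sketch runs a one-dimensional Zig-Zag process for a standard Gaussian target. The target, the function name `zigzag_gaussian_1d`, and its parameters are assumptions chosen for illustration and are not taken from the paper; the sketch does not include the sub-sampling or control-variate variants. For this target the switching rate along a trajectory is max(0, a + s) with a = theta * x, so the next switching time can be drawn exactly by inverting the integrated rate.

```python
import math
import random


def zigzag_gaussian_1d(n_events, x0=0.0, theta0=1.0, seed=0):
    """Illustrative 1D Zig-Zag sampler for a standard Gaussian target.

    Assumption: potential U(x) = x**2 / 2, so the switching rate is
    max(0, theta * x). Returns time-averaged estimates of E[X] and E[X^2].
    """
    rng = random.Random(seed)
    x, theta = x0, theta0
    total_time = 0.0
    int_x = 0.0   # running integral of x(t) dt
    int_x2 = 0.0  # running integral of x(t)^2 dt

    for _ in range(n_events):
        a = theta * x
        e = rng.expovariate(1.0)
        # Solve integral_0^tau max(0, a + s) ds = e exactly for tau.
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)

        # Exact integrals along the linear segment x(t) = x + theta * t.
        int_x += x * tau + theta * tau ** 2 / 2.0
        int_x2 += x ** 2 * tau + x * theta * tau ** 2 + tau ** 3 / 3.0

        # Move to the switching point, then flip the velocity.
        x += theta * tau
        theta = -theta
        total_time += tau

    return int_x / total_time, int_x2 / total_time


if __name__ == "__main__":
    mean, second_moment = zigzag_gaussian_1d(20000, seed=1)
    print("estimated E[X]:", mean)
    print("estimated E[X^2]:", second_moment)
```

The estimates are averages over continuous time along the piecewise linear path rather than averages over the switching points, since the process, not its skeleton of event locations, has the target as its stationary distribution.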
Pages: 1288-1320
Number of pages: 33