THE ZIG-ZAG PROCESS AND SUPER-EFFICIENT SAMPLING FOR BAYESIAN ANALYSIS OF BIG DATA

Cited by: 112

Authors:

Bierkens, Joris ^{[1,2]}

Fearnhead, Paul ^{[3]}

Roberts, Gareth ^{[4]}

Affiliations:

[1] Delft Univ Technol, Delft, Netherlands

[2] Delft Inst Appl Math, Mourik Broekmanweg 6, NL-2628 XE Delft, Netherlands

[3] Univ Lancaster, Fylde Coll, Dept Math & Stat, Lancaster LA1 4YF, England

[4] Univ Warwick, Dept Stat, Coventry CV4 7AL, W Midlands, England

Source:

ANNALS OF STATISTICS | 2019 / Vol. 47 / Issue 03

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

MCMC; nonreversible Markov process; piecewise deterministic Markov process; stochastic gradient Langevin dynamics; sub-sampling; exact sampling; LONG-TIME BEHAVIOR; VARIANCE REDUCTION; SIMULATION;

DOI:

10.1214/18-AOS1715

Chinese Library Classification (CLC) number:

O21 [Probability theory and mathematical statistics]; C8 [Statistics];

Discipline classification code:

020208 ; 070103 ; 0714 ;

Abstract:

Standard MCMC methods can scale poorly to big data settings due to the need to evaluate the likelihood at each iteration. There have been a number of approximate MCMC algorithms that use sub-sampling ideas to reduce this computational burden, but with the drawback that these algorithms no longer target the true posterior distribution. We introduce a new family of Monte Carlo methods based upon a multidimensional version of the Zig-Zag process of [Ann. Appl. Probab. 27 (2017) 846-882], a continuous-time piecewise deterministic Markov process. While traditional MCMC methods are reversible by construction (a property which is known to inhibit rapid convergence) the Zig-Zag process offers a flexible nonreversible alternative which we observe to often have favourable convergence properties. We show how the Zig-Zag process can be simulated without discretisation error, and give conditions for the process to be ergodic. Most importantly, we introduce a sub-sampling version of the Zig-Zag process that is an example of an exact approximate scheme, that is, the resulting approximate process still has the posterior as its stationary distribution. Furthermore, if we use a control-variate idea to reduce the variance of our unbiased estimator, then the Zig-Zag process can be super-efficient: after an initial preprocessing step, essentially independent samples from the posterior distribution are obtained at a computational cost which does not depend on the size of the data.
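As an illustration of the remark that the Zig-Zag process can be simulated without discretisation error, here is a minimal sketch in Python. It is not the authors' code and covers only the basic one-dimensional process, not the sub-sampling or control-variate versions. It assumes a standard normal target, for which the switching rate along a trajectory is max(0, θx + s) and the next switching time can be obtained in closed form by inverting the integrated rate. The function name `zigzag_gaussian_moments` and its parameters are hypothetical, chosen for this example only.

```python
import math
import random


def zigzag_gaussian_moments(n_events, seed=0):
    """Simulate a 1D Zig-Zag process targeting N(0, 1) (an assumed toy target).

    Returns time-averaged estimates of E[x] and E[x^2] along the trajectory.
    """
    rng = random.Random(seed)
    x, theta = 0.0, 1.0   # position and velocity (velocity is +1 or -1)
    total_time = 0.0
    int_x = 0.0           # running integral of x(t) dt
    int_x2 = 0.0          # running integral of x(t)^2 dt
    for _ in range(n_events):
        # For U(x) = x^2 / 2 the rate after time s is max(0, a + s), a = theta * x.
        a = theta * x
        e = rng.expovariate(1.0)
        # Solve integral_0^tau max(0, a + s) ds = e exactly for tau.
        tau = -a + math.sqrt(max(a, 0.0) ** 2 + 2.0 * e)
        x_new = x + theta * tau
        # Exact integrals of x and x^2 over the linear segment from x to x_new.
        int_x += tau * (x + x_new) / 2.0
        int_x2 += tau * (x * x + x * x_new + x_new * x_new) / 3.0
        total_time += tau
        x, theta = x_new, -theta   # flip the velocity at the switching event
    return int_x / total_time, int_x2 / total_time


if __name__ == "__main__":
    mean, second_moment = zigzag_gaussian_moments(20000, seed=1)
    print(mean, second_moment)
```

Because the trajectory is piecewise linear, the averages are computed by integrating along each segment rather than by sampling at discrete times; for the assumed target they should come out close to 0 and 1 respectively.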


Pages: 1288-1320

Page count: 33
