Fast Hamiltonian Monte Carlo Using GPU Computing

Cited by: 12
Authors
Beam, Andrew L. [1 ]
Ghosh, Sujit K. [2 ,3 ]
Doyle, Jon [4 ]
Affiliations
[1] Harvard Univ, Sch Med, Ctr Biomed Informat, Boston, MA 02115 USA
[2] N Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
[3] SAMSI, Res Triangle Pk, NC 27709 USA
[4] N Carolina State Univ, Comp Sci, SAS Inst, Raleigh, NC 27695 USA
Keywords
GPU; Hamiltonian Monte Carlo; MCMC; multinomial regression; Markov chains; regression; models; lasso
DOI
10.1080/10618600.2015.1035724
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject Classification Codes
020208; 070103; 0714;
Abstract
In recent years, the Hamiltonian Monte Carlo (HMC) algorithm has been found to generate samples from high-dimensional probability distributions more efficiently than other popular Markov chain Monte Carlo (MCMC) methods, such as random-walk Metropolis-Hastings. HMC offers better mixing rates and larger effective sample sizes than these earlier techniques, but it may still not be sufficiently fast for particularly large problems. Graphics processing units (GPUs) promise to push HMC even further, greatly increasing the algorithm's utility. By expressing the computationally intensive portions of HMC (the evaluations of the probability kernel and its gradient) in terms of linear or element-wise operations, the algorithm becomes highly amenable to GPU execution. A multinomial regression example demonstrates the promise of GPU-based HMC sampling. By carrying out the entire HMC simulation in GPU-resident memory objects, most of the latency penalties associated with transferring data between main and GPU memory can be avoided. The proposed computational framework may appear conceptually very simple, yet it applies to a wide class of hierarchical models that rely on HMC sampling: any model whose posterior density and corresponding gradients can be reduced to linear or element-wise operations is amenable to significant speedups through the use of GPUs. Analyses of datasets that were previously intractable for fully Bayesian approaches due to prohibitively high computational cost are now feasible using the proposed framework.
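To make the abstract's recipe concrete, below is a minimal, self-contained sketch of HMC for Bayesian multinomial (softmax) regression in which the log-posterior and its gradient reduce to matrix products and element-wise operations, the exact structure the paper identifies as GPU-friendly. This is not the authors' implementation (the paper predates JAX); the array library, the Gaussian prior, and all names such as log_posterior, hmc_step, step_size, and n_leapfrog are illustrative assumptions.

# A hedged sketch of GPU-friendly HMC for multinomial regression.
# NOT the authors' code: JAX, the prior, and all names are assumptions.
import jax
import jax.numpy as jnp

def log_posterior(B, X, Y, sigma2=10.0):
    """Multinomial-logistic log-posterior built from matrix and element-wise ops.

    B: (p, K) coefficients, X: (n, p) design, Y: (n, K) one-hot labels.
    Gaussian prior B ~ N(0, sigma2 * I) is an illustrative choice.
    """
    logits = X @ B                                  # (n, K) matrix product
    log_probs = logits - jax.scipy.special.logsumexp(logits, axis=1, keepdims=True)
    log_lik = jnp.sum(Y * log_probs)                # element-wise multiply + reduce
    log_prior = -0.5 * jnp.sum(B ** 2) / sigma2
    return log_lik + log_prior

def hmc_step(key, B, X, Y, step_size=1e-3, n_leapfrog=20):
    """One HMC transition: leapfrog integration plus Metropolis accept/reject."""
    grad_lp = jax.grad(log_posterior)               # gradient is also matrix ops
    key, k_mom, k_acc = jax.random.split(key, 3)
    p0 = jax.random.normal(k_mom, B.shape)          # momentum draw

    # Leapfrog: half step in momentum, alternating full steps, final half step.
    p = p0 + 0.5 * step_size * grad_lp(B, X, Y)
    q = B
    for _ in range(n_leapfrog - 1):
        q = q + step_size * p
        p = p + step_size * grad_lp(q, X, Y)
    q = q + step_size * p
    p = p + 0.5 * step_size * grad_lp(q, X, Y)

    # Metropolis correction on the joint (position, momentum) energy.
    log_accept = (log_posterior(q, X, Y) - 0.5 * jnp.sum(p ** 2)) \
               - (log_posterior(B, X, Y) - 0.5 * jnp.sum(p0 ** 2))
    accept = jnp.log(jax.random.uniform(k_acc)) < log_accept
    return key, jnp.where(accept, q, B)

# Usage on synthetic data: arrays stay on the accelerator across iterations.
key = jax.random.PRNGKey(0)
n, p, K = 512, 20, 5
X = jax.random.normal(key, (n, p))
Y = jax.nn.one_hot(jax.random.randint(key, (n,), 0, K), K)
B = jnp.zeros((p, K))
for _ in range(100):
    key, B = hmc_step(key, B, X, Y)

On a machine with a CUDA-capable GPU, an array library like JAX keeps X, Y, B, and the momenta in device memory across all iterations, so essentially nothing but final samples needs to cross the host-device boundary; this mirrors the paper's strategy of avoiding main-to-GPU memory transfer latency by performing the entire simulation in GPU-based memory objects.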
Pages: 536-548
Page count: 13