A Sequential Monte Carlo Method for Bayesian Analysis of Massive Datasets

Cited by: 0

Authors:

Greg Ridgeway

David Madigan

Affiliations:

[1] RAND, Department of Statistics

[2] Rutgers University

Source:

Data Mining and Knowledge Discovery | 2003 / Vol. 7

Keywords:

Bayesian inference; massive datasets; Markov chain Monte Carlo; importance sampling; particle filter; mixture model

DOI:

Not available

Abstract:

Markov chain Monte Carlo (MCMC) techniques revolutionized statistical practice in the 1990s by providing an essential toolkit for making the rigor and flexibility of Bayesian analysis computationally practical. At the same time, the increasing prevalence of massive datasets and the expansion of the field of data mining have created the need for statistically sound methods that scale to these large problems. Except for the most trivial examples, current MCMC methods require a complete scan of the dataset for each iteration, eliminating their candidacy as feasible data mining techniques.

Pages: 301-319

Page count: 18
