Bayesian model assessment and comparison using cross-validation predictive densities

Cited by: 123

Authors:

Vehtari, A [1]

Lampinen, J [1]

Affiliation:

[1] Aalto Univ, Lab Computat Engn, FIN-02015 Espoo, Finland

Source:

NEURAL COMPUTATION | 2002, Vol. 14, No. 10

DOI:

10.1162/08997660260293292

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

In this work, we discuss practical methods for the assessment, comparison, and selection of complex hierarchical Bayesian models. A natural way to assess the goodness of the model is to estimate its future predictive capability by estimating expected utilities. Instead of just making a point estimate, it is important to obtain the distribution of the expected utility estimate because it describes the uncertainty in the estimate. The distributions of the expected utility estimates can also be used to compare models, for example, by computing the probability of one model having a better expected utility than some other model. We propose an approach using cross-validation predictive densities to obtain expected utility estimates and Bayesian bootstrap to obtain samples from their distributions. We also discuss the probabilistic assumptions made and properties of two practical cross-validation methods, importance sampling and k-fold cross-validation. As illustrative examples, we use multilayer perceptron neural networks and Gaussian processes with Markov chain Monte Carlo sampling in one toy problem and two challenging real-world problems.
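The comparison idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the per-observation cross-validation utilities (for example, log predictive densities) have already been computed for two models, and it uses synthetic numbers in their place. The function names `bayesian_bootstrap_mean` and `prob_a_better`, the number of draws, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np

def bayesian_bootstrap_mean(values, n_draws=4000, rng=None):
    """Draw samples from the Bayesian bootstrap distribution of the mean.

    Each draw reweights the observations with Dirichlet(1, ..., 1) weights
    and returns the weighted mean, giving a distribution over the expected
    utility rather than a single point estimate.
    """
    rng = np.random.default_rng(rng)
    values = np.asarray(values, dtype=float)
    weights = rng.dirichlet(np.ones(values.size), size=n_draws)
    return weights @ values

def prob_a_better(u_a, u_b, n_draws=4000, rng=None):
    """Estimate P(expected utility of A > expected utility of B).

    Works on paired per-observation differences, so both utility vectors
    must refer to the same held-out observations in the same order.
    """
    diff = np.asarray(u_a, dtype=float) - np.asarray(u_b, dtype=float)
    draws = bayesian_bootstrap_mean(diff, n_draws=n_draws, rng=rng)
    return float(np.mean(draws > 0))

# Synthetic stand-ins for per-observation CV utilities (assumption:
# real values would come from k-fold or importance-sampling CV).
data_rng = np.random.default_rng(0)
n = 200
u_a = data_rng.normal(-1.0, 0.5, n)
u_b = u_a - 0.3 + data_rng.normal(0.0, 0.2, n)  # model B worse on average

draws_a = bayesian_bootstrap_mean(u_a, rng=1)
p = prob_a_better(u_a, u_b, rng=2)
print("Expected utility of A, 90% interval:", np.percentile(draws_a, [5, 95]))
print("P(A better than B):", p)
```

Working on the paired differences rather than on each model separately keeps the comparison sensitive to the correlation between the two models' errors on the same observations.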

Pages: 2439-2468

Page count: 30
