Learning under Model Misspecification: Applications to Variational and Ensemble methods

Cited by: 0

Authors:

Masegosa, Andres R. ^{[1,2]}

Affiliations:

[1] Univ Almeria, Almeria, Spain

[2] Univ Copenhagen, Copenhagen, Denmark

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020 | 2020 / Vol. 33

Keywords:

BAYESIAN-INFERENCE; BOUNDS;

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Virtually any model we use in machine learning to make predictions does not perfectly represent reality. So, most of the learning happens under model misspecification. In this work, we present a novel analysis of the generalization performance of Bayesian model averaging under model misspecification and i.i.d. data using a new family of second-order PAC-Bayes bounds. This analysis shows, in simple and intuitive terms, that Bayesian model averaging provides suboptimal generalization performance when the model is misspecified. In consequence, we provide strong theoretical arguments showing that Bayesian methods are not optimal for learning predictive models, unless the model class is perfectly specified. Using novel second-order PAC-Bayes bounds, we derive a new family of Bayesian-like algorithms, which can be implemented as variational and ensemble methods. The output of these algorithms is a new posterior distribution, different from the Bayesian posterior, which induces a posterior predictive distribution with better generalization performance. Experiments with Bayesian neural networks illustrate these findings.


Pages: 13
