Bayesian Model Selection, the Marginal Likelihood, and Generalization

Cited by: 0

Authors:

Lotfi, Sanae [1]

Izmailov, Pavel [1]

Benton, Gregory [1]

Goldblum, Micah [1]

Wilson, Andrew Gordon [1]

Affiliations:

[1] NYU, New York, NY 10003 USA

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162 | 2022

Keywords:

CHOICE;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

How do we compare between hypotheses that are entirely consistent with observations? The marginal likelihood (aka Bayesian evidence), which represents the probability of generating our observations from a prior, provides a distinctive approach to this foundational question, automatically encoding Occam's razor. Although it has been observed that the marginal likelihood can overfit and is sensitive to prior assumptions, its limitations for hyperparameter learning and discrete model comparison have not been thoroughly investigated. We first revisit the appealing properties of the marginal likelihood for learning constraints and hypothesis testing. We then highlight the conceptual and practical issues in using the marginal likelihood as a proxy for generalization. Namely, we show how marginal likelihood can be negatively correlated with generalization, with implications for neural architecture search, and can lead to both underfitting and overfitting in hyperparameter learning. We provide a partial remedy through a conditional marginal likelihood, which we show is more aligned with generalization, and practically valuable for large-scale hyperparameter learning, such as in deep kernel learning.
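As a rough illustration of the two quantities named in the abstract, the sketch below computes a log marginal likelihood in closed form and a conditional variant obtained from the chain rule, log p(D_{m+1:n} | D_{1:m}) = log p(D_{1:n}) - log p(D_{1:m}). It is not the authors' code. It assumes a toy Bayesian linear regression with a zero-mean isotropic Gaussian prior on the weights and a known noise variance; the function names, the parameters `prior_var` and `noise_var`, and the choice of conditioning on the first `m` points are all illustrative assumptions.

```python
import numpy as np


def log_marginal_likelihood(X, y, prior_var=1.0, noise_var=1.0):
    """Log evidence log p(y | X) for Bayesian linear regression.

    Assumed toy model (not taken from the record above):
    w ~ N(0, prior_var * I), y = X w + eps, eps ~ N(0, noise_var * I).
    Integrating out w gives y ~ N(0, prior_var * X X^T + noise_var * I).
    """
    n = X.shape[0]
    K = prior_var * (X @ X.T) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)          # K = L L^T
    z = np.linalg.solve(L, y)          # z^T z = y^T K^{-1} y
    log_det_half = np.log(np.diag(L)).sum()  # 0.5 * log det K
    return -0.5 * (z @ z) - log_det_half - 0.5 * n * np.log(2.0 * np.pi)


def conditional_log_marginal_likelihood(X, y, m, prior_var=1.0, noise_var=1.0):
    """log p(y[m:] | y[:m], X) via the chain rule (requires 1 <= m < n)."""
    full = log_marginal_likelihood(X, y, prior_var, noise_var)
    head = log_marginal_likelihood(X[:m], y[:m], prior_var, noise_var)
    return full - head


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)
    print("log marginal likelihood:", log_marginal_likelihood(X, y))
    print("conditional (m=10):", conditional_log_marginal_likelihood(X, y, m=10))
```

Under these assumptions, the conditional quantity scores how well the model predicts the later points after its posterior has been updated on the first `m` points, rather than how well the prior alone explains all of the data.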


Pages: 25

