Summary statistics and discrepancy measures for approximate Bayesian computation via surrogate posteriors

Cited by: 0
Authors
Florence Forbes
Hien Duy Nguyen
TrungTin Nguyen
Julyan Arbel
Institutions
[1] Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK
[2] School of Mathematics and Physics, University of Queensland
[3] Inria Grenoble Rhône-Alpes
[4] Normandie University, UNICAEN, CNRS, LMNO
Source
Statistics and Computing | 2022, Volume 32
Keywords
Approximate Bayesian computation; Summary statistics; Surrogate models; Gaussian mixtures; Wasserstein distance; Multimodal posterior distributions;
DOI
Not available
Abstract
A key ingredient in approximate Bayesian computation (ABC) procedures is the choice of a discrepancy that describes how different the simulated and observed data are, often based on a set of summary statistics when the data cannot be compared directly. Unless discrepancies and summaries are available from experts or prior knowledge, which seldom occurs, they have to be chosen, and thus their choice can affect the quality of approximations. The choice between discrepancies is an active research topic, which has mainly considered data discrepancies requiring samples of observations or distances between summary statistics. In this work, we introduce a preliminary learning step in which surrogate posteriors are built from finite Gaussian mixtures using an inverse regression approach. These surrogate posteriors are then used in place of summary statistics and compared using metrics between distributions in place of data discrepancies. Two such metrics are investigated: a standard $L_2$ distance and an optimal transport-based distance. The whole procedure can be seen as an extension of the semi-automatic ABC framework to a functional summary statistics setting and can also be used as an alternative to sample-based approaches. The resulting ABC quasi-posterior distribution is shown to converge to the true one, under standard conditions. Performance is illustrated on both synthetic and real data sets, where it is shown that our approach is particularly useful when the posterior is multimodal.
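To make the idea of comparing surrogate posteriors concrete, below is a minimal Python sketch (not the authors' code) of how a closed-form $L_2$ discrepancy between two Gaussian mixtures could plug into a plain rejection-ABC loop. It relies on the standard identity $\int \mathcal{N}(x; m, S)\,\mathcal{N}(x; m', S')\,dx = \mathcal{N}(m; m', S + S')$; the functions `fit_surrogate_posterior`, `prior_sampler`, and `simulator` are hypothetical placeholders standing in for the inverse-regression step and the user's model, and the optimal transport-based variant is not shown.

```python
# Hedged sketch: L2 discrepancy between Gaussian-mixture surrogate posteriors,
# used inside a basic rejection-ABC loop. Placeholder functions are assumptions.
import numpy as np
from scipy.stats import multivariate_normal


def gmm_l2_distance(w1, mu1, cov1, w2, mu2, cov2):
    """Squared L2 distance between two Gaussian mixture densities.

    Each mixture is given by weights w, component means mu and covariances cov.
    Uses ∫ N(x; m, S) N(x; m', S') dx = N(m; m', S + S'), so the distance has a
    closed form in the mixture parameters (no sampling or numerical integration).
    """
    def cross_term(wa, ma, Sa, wb, mb, Sb):
        total = 0.0
        for i in range(len(wa)):
            for j in range(len(wb)):
                total += wa[i] * wb[j] * multivariate_normal.pdf(
                    ma[i], mean=mb[j], cov=Sa[i] + Sb[j])
        return total

    return (cross_term(w1, mu1, cov1, w1, mu1, cov1)
            - 2.0 * cross_term(w1, mu1, cov1, w2, mu2, cov2)
            + cross_term(w2, mu2, cov2, w2, mu2, cov2))


def abc_with_surrogates(y_obs, prior_sampler, simulator,
                        fit_surrogate_posterior, n_sims=10_000, quantile=0.01):
    """Rejection ABC where the discrepancy compares surrogate posteriors.

    `fit_surrogate_posterior(y)` is a placeholder for the inverse-regression
    step: it should return (weights, means, covariances) of a Gaussian-mixture
    approximation of p(theta | y).
    """
    surrogate_obs = fit_surrogate_posterior(y_obs)
    draws, dists = [], []
    for _ in range(n_sims):
        theta = prior_sampler()
        y_sim = simulator(theta)
        surrogate_sim = fit_surrogate_posterior(y_sim)
        dists.append(gmm_l2_distance(*surrogate_obs, *surrogate_sim))
        draws.append(theta)
    eps = np.quantile(dists, quantile)  # keep the closest simulations
    return [t for t, d in zip(draws, dists) if d <= eps]
```

The closed-form distance is what makes functional summaries practical here: each simulation is reduced to a small set of mixture parameters, and comparing two simulations costs only a double loop over components rather than a comparison of raw data sets.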