Identifying influential observations in Bayesian models by using Markov chain Monte Carlo

Cited by: 11
Authors
Jackson, Dan [1]
White, Ian R. [1]
Carpenter, James [2]
Affiliations
[1] MRC Biostat Unit, Cambridge, England
[2] London Sch Hyg & Trop Med, London, England
Keywords
Bayesian methods; generalised linear models; influence; Markov chain Monte Carlo; diagnostics; regression
DOI
10.1002/sim.4356
Chinese Library Classification number
Q [Biological sciences]
Subject classification codes
07; 0710; 09
Abstract
In statistical modelling, it is often important to know how much parameter estimates are influenced by particular observations. An attractive approach is to re-estimate the parameters with each observation deleted in turn, but this is computationally demanding when fitting models by using Markov chain Monte Carlo (MCMC), as obtaining complete sample estimates is often in itself a very time-consuming task. Here we propose two efficient ways to approximate the case-deleted estimates by using output from MCMC estimation. Our first proposal, which directly approximates the usual influence statistics in maximum likelihood analyses of generalised linear models (GLMs), is easy to implement and avoids any further evaluation of the likelihood. Hence, unlike the existing alternatives, it does not become more computationally intensive as the model complexity increases. Our second proposal, which utilises model perturbations, also has this advantage and does not require the form of the GLM to be specified. We show how our two proposed methods are related and evaluate them against the existing method of importance sampling and case deletion in a logistic regression analysis with missing covariates. We also provide practical advice for those implementing our procedures, so that they may be used in many situations where MCMC is used to fit statistical models. Copyright (C) 2011 John Wiley & Sons, Ltd.
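The existing importance-sampling benchmark mentioned in the abstract can be illustrated with a minimal sketch. It is not either of the authors' two proposals. It relies on the standard identity that the case-deleted posterior is proportional to the full posterior divided by the likelihood contribution of the deleted observation, so each posterior draw is reweighted by 1 / p(y_i | theta). The function name `case_deleted_means`, the toy normal-mean model with known unit variance and a flat prior, and the use of exact posterior draws in place of real MCMC output are all assumptions made for illustration, chosen so that the exact case-deleted means are available for comparison.

```python
import numpy as np


def case_deleted_means(draws, loglik):
    """Importance-sampling estimates of E[theta | y without observation i].

    draws  : (S, p) array of posterior draws from the full-data fit.
    loglik : (S, n) array, loglik[s, i] = log p(y_i | theta_s).
    Returns an (n, p) array, one row per deleted observation.
    (Hypothetical helper, not taken from the paper.)
    """
    draws = np.asarray(draws, dtype=float)
    loglik = np.asarray(loglik, dtype=float)
    logw = -loglik                    # log weight is -log p(y_i | theta_s) up to a constant
    logw -= logw.max(axis=0)          # subtract column maximum for numerical stability
    w = np.exp(logw)
    w /= w.sum(axis=0)                # normalise weights separately for each observation
    return w.T @ draws                # weighted average of the draws


# Toy data (assumption): y_i ~ N(mu, 1) with a flat prior on mu,
# so mu | y ~ N(mean(y), 1/n) and the case-deleted mean is mean(y without y_i).
rng = np.random.default_rng(0)
n = 20
y = rng.normal(0.0, 1.0, size=n)
y[0] = 4.0                            # plant one unusual observation

S = 100_000
# Exact posterior draws stand in for MCMC output in this sketch.
mu = rng.normal(y.mean(), 1.0 / np.sqrt(n), size=S)

# Pointwise log-likelihood of every observation under every draw.
loglik = -0.5 * (y[None, :] - mu[:, None]) ** 2 - 0.5 * np.log(2.0 * np.pi)

est = case_deleted_means(mu[:, None], loglik)[:, 0]
exact = (y.sum() - y) / (n - 1)       # analytic case-deleted posterior means

# Influence of each observation: shift in the posterior mean when it is deleted.
influence = est - mu.mean()
print("most influential observation:", int(np.argmax(np.abs(influence))))
print("max abs error vs exact:", float(np.max(np.abs(est - exact))))
```

Unlike the first proposal described in the abstract, this approach needs the likelihood of each observation evaluated at every draw, and the weights can become unstable when a deleted observation is highly influential.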
Pages: 1238-1248
Number of pages: 11