A method for simultaneous variable selection and outlier identification in linear regression

Cited by: 75
Authors
Hoeting, J
Raftery, AE
Madigan, D
Affiliations
[1] COLORADO STATE UNIV,DEPT STAT,FT COLLINS,CO 80523
[2] UNIV WASHINGTON,DEPT STAT,SEATTLE,WA 98195
Funding
National Science Foundation (USA);
Keywords
Bayesian model averaging; Markov chain Monte Carlo model composition; masking; model uncertainty; posterior model probability;
DOI
10.1016/0167-9473(95)00053-4
Chinese Library Classification
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
We suggest a method for simultaneous variable selection and outlier identification based on the computation of posterior model probabilities. This avoids the problem that the model you select depends upon the order in which variable selection and outlier identification are carried out. Our method can find multiple outliers and appears to be successful in identifying masked outliers. We also address the problem of model uncertainty via Bayesian model averaging. For problems where the number of models is large, we suggest a Markov chain Monte Carlo approach to approximate the Bayesian model average over the space of all possible variables and outliers under consideration. Software for implementing this approach is described. In an example, we show that model averaging via simultaneous variable selection and outlier identification improves predictive performance and provides more accurate prediction intervals as compared to any single model that might reasonably be selected.
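The Markov chain Monte Carlo model composition idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes each model is encoded as a tuple of 0/1 flags (one per candidate predictor and one per candidate outlier), so that variables and outliers are searched jointly rather than in sequence. The names `mc3`, `inclusion_probabilities`, and `toy_log_score` are hypothetical, and the toy score is only a stand-in for the log posterior model probability, which the abstract does not specify; this is not the authors' software.

```python
import math
import random


def mc3(log_score, n_flags, n_iter=20000, seed=0):
    """Metropolis sampler over binary inclusion vectors (illustrative sketch).

    A proposal flips one randomly chosen flag, so the proposal distribution
    is symmetric and the acceptance probability depends only on the ratio
    of unnormalized posterior model scores.
    """
    rng = random.Random(seed)
    current = (0,) * n_flags
    current_score = log_score(current)
    visits = {}
    for _ in range(n_iter):
        j = rng.randrange(n_flags)
        proposal = current[:j] + (1 - current[j],) + current[j + 1:]
        proposal_score = log_score(proposal)
        # Accept with probability min(1, score ratio).
        accept_prob = math.exp(min(0.0, proposal_score - current_score))
        if rng.random() < accept_prob:
            current, current_score = proposal, proposal_score
        visits[current] = visits.get(current, 0) + 1
    # Visit frequencies approximate posterior model probabilities.
    return {model: count / n_iter for model, count in visits.items()}


def inclusion_probabilities(model_probs, n_flags):
    """Sum model probabilities over the models in which each flag is on."""
    return [
        sum(p for model, p in model_probs.items() if model[j] == 1)
        for j in range(n_flags)
    ]


def toy_log_score(model):
    # ASSUMPTION: hypothetical score, not the paper's posterior. Each flag
    # that is switched on triples the unnormalized model probability, so
    # every flag has an exact inclusion probability of 3 / (1 + 3) = 0.75.
    return sum(model) * math.log(3.0)


if __name__ == "__main__":
    probs = mc3(toy_log_score, n_flags=3)
    print(inclusion_probabilities(probs, 3))
```

In a real application the flags with high estimated inclusion probability would correspond to the predictors to keep and the observations to treat as outliers, and predictions would be averaged over the visited models weighted by their estimated probabilities.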
Pages: 251-270
Number of pages: 20
Related papers
50 in total
  • [21] Simultaneous variable selection for heteroscedastic regression models
    Zhang ZhongZhan
    Wang DaRong
    SCIENCE CHINA-MATHEMATICS, 2011, 54 (03) : 515 - 530
  • [22] Subset selection in multiple linear regression in the presence of outlier and multicollinearity
    Jadhav, Nileshkumar H.
    Kashid, Dattatraya N.
    Kulkarni, Subhash R.
    STATISTICAL METHODOLOGY, 2014, 19 : 44 - 59
  • [23] Simultaneous variable selection and outlier detection using a robust genetic algorithm
    Wiegand, Patrick
    Pell, Randy
    Comas, Enric
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2009, 98 (02) : 108 - 114
  • [24] Robust Moderately Clipped LASSO for Simultaneous Outlier Detection and Variable Selection
    Peng, Yang
    Luo, Bin
    Gao, Xiaoli
    SANKHYA-SERIES B-APPLIED AND INTERDISCIPLINARY STATISTICS, 2022, 84 (02) : 694 - 707
  • [25] Variable selection and transformation in linear regression models
    Yeo, IK
    STATISTICS & PROBABILITY LETTERS, 2005, 72 (03) : 219 - 226
  • [26] Variable Selection in Linear Regression With Many Predictors
    Cai, Airong
    Tsay, Ruey S.
    Chen, Rong
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2009, 18 (03) : 573 - 591
  • [27] Robust Moderately Clipped LASSO for Simultaneous Outlier Detection and Variable Selection
    Yang Peng
    Bin Luo
    Xiaoli Gao
    Sankhya B, 2022, 84 : 694 - 707
  • [28] RESPONSE VARIABLE SELECTION IN MULTIVARIATE LINEAR REGRESSION
    Khare, Kshitij
    Su, Zhihua
    STATISTICA SINICA, 2024, 34 (03) : 1325 - 1345
  • [29] Variable selection in functional linear concurrent regression
    Ghosal, Rahul
    Maity, Arnab
    Clark, Timothy
    Longo, Stefano B.
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2020, 69 (03) : 565 - 587
  • [30] Variable Selection in Multivariate Functional Linear Regression
    Yeh, Chi-Kuang
    Sang, Peijun
    STATISTICS IN BIOSCIENCES, 2023, 17 (1) : 17 - 34