Robust and Accurate Inference via a Mixture of Gaussian and Student's t Errors

Cited by: 9
Authors
Tak, Hyungsuk [1]
Ellis, Justin A. [2]
Ghosh, Sujit K. [3]
Affiliations
[1] Univ Notre Dame, Dept Appl & Computat Math & Stat, Notre Dame, IN 46556 USA
[2] Infinia ML, Durham, NC USA
[3] North Carolina State Univ, Dept Stat, Raleigh, NC USA
Keywords
Gaussian process; Gibbs sampling; Hierarchical model; Huber's M-estimator; Linear mixed model; Outlier; Time series; SCALE MIXTURES; MODELS; VARIABILITY; REGRESSION; QUASARS
DOI
10.1080/10618600.2018.1537925
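Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract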
A Gaussian measurement error assumption, that is, an assumption that the data are observed up to Gaussian noise, can bias any parameter estimation in the presence of outliers. A heavy tailed error assumption based on Student's t distribution helps reduce the bias. However, it may be less efficient in estimating parameters if the heavy tailed assumption is uniformly applied to all of the data when most of them are normally observed. We propose a mixture error assumption that selectively converts Gaussian errors into Student's t errors according to latent outlier indicators, leveraging the best of the Gaussian and Student's t errors; a parameter estimation can be not only robust but also accurate. Using simulated hospital profiling data and astronomical time series of brightness data, we demonstrate the potential for the proposed mixture error assumption to estimate parameters accurately in the presence of outliers. Supplemental materials for this article are available online.
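The mixture error assumption described in the abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that each Student's t error is generated through the standard scale-mixture representation: a Gaussian draw whose variance is inflated by an inverse-Gamma(nu/2, nu/2) factor, applied only when a Bernoulli latent outlier indicator equals 1. The function name `draw_mixture_errors` and the parameters `sigma`, `theta`, `nu`, and `seed` are hypothetical, chosen only for illustration; the article's actual model and sampler may differ.

```python
import random


def draw_mixture_errors(n, sigma=1.0, theta=0.1, nu=4.0, seed=0):
    """Draw n errors from a Gaussian / Student's t mixture (illustrative only).

    sigma : Gaussian error standard deviation (assumed known here)
    theta : assumed probability that an observation is an outlier
    nu    : assumed degrees of freedom of the Student's t component
    """
    rng = random.Random(seed)
    errors, indicators = [], []
    for _ in range(n):
        # Latent outlier indicator: 1 means "use the heavy-tailed error".
        z = 1 if rng.random() < theta else 0
        # alpha ~ Inverse-Gamma(nu/2, nu/2), drawn as the reciprocal of a
        # Gamma variate with shape nu/2 and scale 2/nu.
        alpha = 1.0 / rng.gammavariate(nu / 2.0, 2.0 / nu)
        # The variance is inflated by alpha only for flagged observations,
        # so z = 0 leaves a plain Gaussian error.
        scale = sigma * (alpha ** 0.5 if z == 1 else 1.0)
        errors.append(rng.gauss(0.0, scale))
        indicators.append(z)
    return errors, indicators


if __name__ == "__main__":
    errs, z = draw_mixture_errors(1000, theta=0.1, seed=1)
    print("flagged as outliers:", sum(z), "of", len(z))
    print("largest absolute error:", max(abs(e) for e in errs))
```

Setting `theta=0` recovers purely Gaussian errors and `theta=1` recovers purely Student's t errors, which are the two assumptions the abstract contrasts with the mixture.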
Pages: 415-426
Number of pages: 12
Related papers
37 in total
  • [1] MIXTURE-MODELS, OUTLIERS, AND THE EM ALGORITHM
    AITKIN, M
    WILSON, GT
    [J]. TECHNOMETRICS, 1980, 22 (03) : 325 - 331
  • [2] ANDREWS DF, 1974, J ROY STAT SOC B MET, V36, P99
  • [3] Berger J. O., 1985, Statistical Decision Theory and Bayesian Analysis, V2, DOI 10.1007/978-1-4757-4286-2
  • [4] Bhatia K., 2016, 160700146 ARXIV
  • [5] High-breakdown inference for mixed linear models
    Copt, S
    Victoria-Feser, MP
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2006, 101 (473) : 292 - 300
  • [6] MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM
    DEMPSTER, AP
    LAIRD, NM
    RUBIN, DB
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) : 1 - 38
  • [7] Variability-selected quasars in MACHO project Magellanic Cloud fields
    Geha, M
    Alcock, C
    Allsman, RA
    Alves, DR
    Axelrod, TS
    Becker, AC
    Bennett, DP
    Cook, KH
    Drake, AJ
    Freeman, KC
    Griest, K
    Keller, SC
    Lehner, MJ
    Marshall, SL
    Minniti, D
    Nelson, CA
    Peterson, BA
    Popowski, P
    Pratt, MR
    Quinn, PJ
    Stubbs, CW
    Sutherland, W
    Tomaney, AB
    Vandehei, T
    Welch, DL
    [J]. ASTRONOMICAL JOURNAL, 2003, 125 (01) : 1 - 12
  • [8] GELMAN A., 2014, Texts in Statistical Science Series, V3rd
  • [9] The mixtures of Student's t-distributions as a robust framework for rigid registration
    Gerogiannis, Demetrios
    Nikou, Christophoros
    Likas, Aristidis
    [J]. IMAGE AND VISION COMPUTING, 2009, 27 (09) : 1285 - 1294
  • [10] A class of robust and fully efficient regression estimators
    Gervini, D
    Yohai, VJ
    [J]. ANNALS OF STATISTICS, 2002, 30 (02) : 583 - 616