Robust logistic zero-sum regression for microbiome compositional data

Cited by: 13
Authors
Monti, G. S. [1]
Filzmoser, P. [2]
Affiliations
[1] Univ Milano Bicocca, Dept Econ Management & Stat, Milan, Italy
[2] Vienna Univ Technol, Inst Stat & Math Methods Econ, Vienna, Austria
Keywords
Robustness; High dimensional data; Metagenomics; Penalized estimation; Variable selection; Regularization; Estimator; Models
DOI
10.1007/s11634-021-00465-4
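Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract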
We introduce the Robust Logistic Zero-Sum Regression (RobLZS) estimator, which can be used for a two-class problem with high-dimensional compositional covariates. Because it employs the log-contrast model, the estimator can perform feature selection among the compositional parts. The proposed method attains robustness by minimizing a trimmed sum of deviances. The performance of the RobLZS estimator is compared with that of a non-robust counterpart and of other sparse logistic regression estimators via Monte Carlo simulation studies. Two microbiome data applications are considered to investigate the stability of the estimators in the presence of outliers. Robust Logistic Zero-Sum Regression is available as an R package that can be downloaded at https://github.com/giannamonti/RobZS.
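The two ingredients named in the abstract, a log-contrast predictor with zero-sum coefficients and a trimmed sum of deviances, can be illustrated with a minimal Python sketch. This is not the RobZS package's API: the function names, the toy data, and the trimming size `h` are all assumptions made for illustration, and the penalty term and the search for the best subset of observations are omitted.

```python
import math

def linear_predictor(x, beta, intercept=0.0):
    """Log-contrast predictor: intercept + sum_j beta_j * log(x_j)."""
    return intercept + sum(b * math.log(v) for b, v in zip(beta, x))

def deviance(y, eta):
    """Binomial deviance of one observation with label y in {0, 1}."""
    # Numerically stable evaluation of log(1 + exp(eta)).
    log1pexp = max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))
    return -2.0 * (y * eta - log1pexp)

def trimmed_deviance(X, y, beta, h, intercept=0.0):
    """Sum of the h smallest deviances; the n - h largest are trimmed."""
    if abs(sum(beta)) > 1e-9:
        raise ValueError("coefficients must sum to zero")
    devs = sorted(
        deviance(yi, linear_predictor(xi, beta, intercept))
        for xi, yi in zip(X, y)
    )
    return sum(devs[:h])

# Hypothetical toy data: 4 compositions with 3 parts each.
# The last observation carries a label that contradicts the pattern.
X = [[0.6, 0.2, 0.2], [0.5, 0.1, 0.4], [0.1, 0.7, 0.2], [0.7, 0.1, 0.2]]
y = [1, 1, 0, 0]
beta = [1.0, -1.0, 0.0]  # sums to zero

full = trimmed_deviance(X, y, beta, h=4)     # no trimming
trimmed = trimmed_deviance(X, y, beta, h=3)  # largest deviance dropped
```

Because the coefficients sum to zero, rescaling a composition by any positive constant leaves the predictor unchanged, so the fit does not depend on the total count of a sample.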
Pages: 301-324
Page count: 24