Robust sparse regression by modeling noise as a mixture of Gaussians

Cited by: 4
Authors
Xu, Shuang [1]
Zhang, Chun-Xia [1]
Affiliation
[1] Xi'an Jiaotong University, School of Mathematics & Statistics, Xi'an 710049, Shaanxi, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Robust regression; penalized regression; variable selection; mixture of Gaussians; lasso; VARIABLE SELECTION; REGULARIZATION; SHRINKAGE; ALGORITHM
DOI
10.1080/02664763.2019.1566448
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject Classification Codes
020208; 070103; 0714
Abstract
Regression analysis has proven to be an effective tool in a wide variety of fields. Many regression models assume that the noise follows a specific distribution. Although such an assumption greatly facilitates theoretical analysis, model-fitting performance may be poor when the assumed noise distribution deviates substantially from the real one. Given the complexity of real-world data, the model is also expected to be robust. Without making any assumption about the noise, we propose in this paper a novel sparse regression method, called MoG-Lasso, which directly models the noise in linear regression models via a mixture of Gaussian distributions (MoG). A lasso penalty is included as part of the loss function of MoG-Lasso to enhance its ability to identify a sparse model. To estimate the parameters of MoG-Lasso, we present an efficient algorithm based on the EM (expectation-maximization) and ADMM (alternating direction method of multipliers) algorithms. Experiments on simulated and real data contaminated by complex noise show that MoG-Lasso performs better than several popular methods, including Lasso, LAD-Lasso, and Huber-Lasso, in both 'p > n' and 'p < n' situations.
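The abstract describes the estimation scheme only at a high level: EM handles the mixture-of-Gaussians noise model, while ADMM handles the penalized coefficient update. The sketch below is a minimal NumPy illustration of that idea, not the authors' algorithm; it assumes zero-mean Gaussian components for the noise and substitutes a simple proximal-gradient (ISTA) step for the ADMM coefficient update. The function name mog_lasso and the parameters K (number of mixture components) and lam (penalty strength) are illustrative choices, not names from the paper.

```python
import numpy as np

def mog_lasso(X, y, K=2, lam=0.1, n_iter=200, tol=1e-6, seed=0):
    # Hypothetical sketch: lasso regression whose noise is modeled as a
    # zero-mean K-component Gaussian mixture, fitted by EM. The coefficient
    # update uses a proximal-gradient (ISTA) step in place of the paper's ADMM.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    pi = np.full(K, 1.0 / K)                 # mixture weights
    sigma2 = rng.uniform(0.5, 2.0, size=K)   # component variances
    spec = np.linalg.norm(X, 2) ** 2         # squared spectral norm of X

    for _ in range(n_iter):
        r = y - X @ beta                     # residuals play the role of noise
        # E-step: responsibility of component k for observation i.
        dens = np.exp(-0.5 * r[:, None] ** 2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
        gamma = pi * dens + 1e-300           # guard against total underflow
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step for the mixture: weights and (zero-mean) variances.
        pi = gamma.mean(axis=0)
        sigma2 = (gamma * r[:, None] ** 2).sum(axis=0) / (n * pi + 1e-12) + 1e-12
        # M-step for beta: one ISTA pass on the weighted least-squares loss
        # (1/2n) * sum_i w_i * r_i^2 + lam * ||beta||_1.
        w = (gamma / sigma2).sum(axis=1)     # per-observation precision weights
        step = n / (w.max() * spec)          # 1 / Lipschitz constant of the gradient
        grad = -(X.T @ (w * r)) / n
        beta_new = beta - step * grad
        beta_new = np.sign(beta_new) * np.maximum(np.abs(beta_new) - step * lam, 0.0)
        if np.max(np.abs(beta_new - beta)) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta, pi, sigma2

if __name__ == "__main__":
    # Toy demo: sparse signal with 10% heavy-tailed outliers in the noise.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 50))
    beta_true = np.zeros(50)
    beta_true[:5] = 2.0
    noise = np.where(rng.random(200) < 0.9,
                     0.5 * rng.standard_normal(200),
                     5.0 * rng.standard_normal(200))
    y = X @ beta_true + noise
    beta_hat, pi_hat, sigma2_hat = mog_lasso(X, y, K=2, lam=0.05)
    print(np.round(beta_hat[:8], 2), pi_hat, sigma2_hat)
```

In this sketch the per-observation weights w_i are large for points assigned to low-variance components and small for points assigned to high-variance ones, so suspected outliers are automatically down-weighted in the coefficient update; this is the mechanism that makes such a fit robust.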
Pages: 1738-1755
Number of pages: 18
Related Papers
50 records in total
  • [31] Finite mixture regression: A sparse variable selection by model selection for clustering
    Devijver, Emilie
    ELECTRONIC JOURNAL OF STATISTICS, 2015, 9(2): 2642-2674
  • [32] Robust and sparse regression in generalized linear model by stochastic optimization
    Kawashima, Takayuki
    Fujisawa, Hironori
    JAPANESE JOURNAL OF STATISTICS AND DATA SCIENCE, 2019, 2(2): 465-489
  • [33] The adaptive BerHu penalty in robust regression
    Lambert-Lacroix, Sophie
    Zwald, Laurent
    JOURNAL OF NONPARAMETRIC STATISTICS, 2016, 28(3): 487-514
  • [34] A fast robust best subset regression
    Ming, Hao
    Yang, Hu
    KNOWLEDGE-BASED SYSTEMS, 2024, 284
  • [35] Penalized Sparse Covariance Regression with High Dimensional Covariates
    Gao, Yuan
    Zhang, Zhiyuan
    Cai, Zhanrui
    Zhu, Xuening
    Zou, Tao
    Wang, Hansheng
    JOURNAL OF BUSINESS & ECONOMIC STATISTICS, 2024
  • [36] Confidence Intervals for Sparse Penalized Regression With Random Designs
    Yu, Guan
    Yin, Liang
    Lu, Shu
    Liu, Yufeng
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2020, 115(530): 794-809
  • [37] Aggregated hold out for sparse linear regression with a robust loss function
    Maillard, Guillaume
    ELECTRONIC JOURNAL OF STATISTICS, 2022, 16(1): 935-997
  • [38] Deviance residuals-based sparse PLS and sparse kernel PLS regression for censored data
    Bastien, Philippe
    Bertrand, Frederic
    Meyer, Nicolas
    Maumy-Bertrand, Myriam
    BIOINFORMATICS, 2015, 31(3): 397-404
  • [39] SPARSE LEAST TRIMMED SQUARES REGRESSION FOR ANALYZING HIGH-DIMENSIONAL LARGE DATA SETS
    Alfons, Andreas
    Croux, Christophe
    Gelper, Sarah
    ANNALS OF APPLIED STATISTICS, 2013, 7(1): 226-248
  • [40] A Phylogeny-Regularized Sparse Regression Model for Predictive Modeling of Microbial Community Data
    Xiao, Jian
    Chen, Li
    Yu, Yue
    Zhang, Xianyang
    Chen, Jun
    FRONTIERS IN MICROBIOLOGY, 2018, 9