CR-Lasso: Robust cellwise regularized sparse regression

Cited by: 3
Authors
Su, Peng [1]
Tarr, Garth [1]
Muller, Samuel [1,2]
Wang, Suojin [3]
Affiliations
[1] Univ Sydney, Sch Math & Stat, Sydney, NSW 2006, Australia
[2] Macquarie Univ, Sch Math & Phys Sci, Sydney, NSW 2109, Australia
[3] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
Funding
Australian Research Council
Keywords
Cellwise contamination; Cellwise regularization; Robust sparse regression; Feature selection; SELECTION; ESTIMATOR; MODEL
DOI
10.1016/j.csda.2024.107971
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Cellwise contamination remains a challenging problem for data scientists, particularly in research fields that require the selection of sparse features. Traditional robust methods may be neither feasible nor efficient in dealing with such contaminated datasets. A robust Lasso-type cellwise regularization procedure, coined CR-Lasso, is proposed that performs feature selection in the presence of cellwise outliers by minimising a regression loss and a cell deviation measure simultaneously. The approach is evaluated in simulation studies that compare its selection and prediction performance with those of several sparse regression methods. The results demonstrate that CR-Lasso is competitive within the considered settings. The effectiveness of the proposed method is further illustrated through an analysis of a bone mineral density dataset.
Pages: 14