Sparse PCA for High-Dimensional Data With Outliers

Cited by: 50
Authors
Hubert, Mia [1]
Reynkens, Tom [1]
Schmitt, Eric [1]
Verdonck, Tim [1]
Affiliations
[1] Katholieke Univ Leuven, Dept Math, Leuven, Belgium
Keywords
Dimension reduction; Outlier detection; Robustness; Projection-pursuit approach; Principal components; Robust PCA
DOI
10.1080/00401706.2015.1093962
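Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714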
Abstract
A new sparse PCA algorithm is presented, which is robust against outliers. The approach is based on the ROBPCA algorithm that generates robust but nonsparse loadings. The construction of the new ROSPCA method is detailed, as well as a selection criterion for the sparsity parameter. An extensive simulation study and a real data example are performed, showing that it is capable of accurately finding the sparse structure of datasets, even when challenging outliers are present. In comparison with a projection pursuit-based algorithm, ROSPCA demonstrates superior robustness properties and comparable sparsity estimation capability, as well as significantly faster computation time.
Pages: 424-434
Page count: 11