Cauchy robust principal component analysis with applications to high-dimensional data sets

Cited by: 0
Authors
Aisha Fayomi
Yannis Pantazis
Michail Tsagris
Andrew T. A. Wood
Affiliations
[1] King Abdulaziz University,Department of Statistics
[2] Foundation for Research and Technology - Hellas,Institute of Applied and Computational Mathematics
[3] University of Crete,Department of Economics
[4] Australian National University,Research School of Finance, Actuarial Studies & Statistics
Source
Statistics and Computing | 2024 / Vol. 34
Keywords
Principal component analysis; Robust; Cauchy log-likelihood; High-dimensional data
DOI
Not available
Abstract
Principal component analysis (PCA) is a standard dimensionality reduction technique used in various research and applied fields. From an algorithmic point of view, classical PCA can be formulated in terms of operations on a multivariate Gaussian likelihood. As a consequence of the implied Gaussian formulation, the principal components are not robust to outliers. In this paper, we propose a modified formulation, based on the use of a multivariate Cauchy likelihood instead of the Gaussian likelihood, which has the effect of robustifying the principal components. We present an algorithm to compute these robustified principal components. We additionally derive the relevant influence function of the first component and examine its theoretical properties. Simulation experiments on high-dimensional datasets demonstrate that the estimated principal components based on the Cauchy likelihood typically outperform, or are on a par with, existing robust PCA techniques. Moreover, the Cauchy PCA algorithm we have used has much lower computational cost in very high-dimensional settings than the other public-domain robust PCA methods we consider.
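To make the idea in the abstract concrete, here is a minimal two-dimensional sketch in Python. It is not the authors' algorithm. It rests on one assumption the abstract does not confirm: the first direction is taken to be the one whose projections have the smallest maximised univariate Cauchy log-likelihood. This mirrors the Gaussian case, where minimising the maximised log-likelihood is equivalent to maximising the variance. The function names, the grid search over angles, the EM fit and the toy data are all hypothetical choices made for illustration.

```python
import math
import random
import statistics


def cauchy_mle(z, iters=60):
    """Fit a univariate Cauchy(mu, sigma) to z by EM (Cauchy = Student t with 1 d.f.)."""
    mu = statistics.median(z)
    q = statistics.quantiles(z, n=4)
    sigma = max((q[2] - q[0]) / 2.0, 1e-8)  # half the interquartile range as a start
    for _ in range(iters):
        # E-step: weights shrink towards zero for points far from mu
        w = [2.0 / (1.0 + ((v - mu) / sigma) ** 2) for v in z]
        # M-step: weighted location, then weighted scale
        mu = sum(wi * v for wi, v in zip(w, z)) / sum(w)
        sigma = math.sqrt(sum(wi * (v - mu) ** 2 for wi, v in zip(w, z)) / len(z))
        sigma = max(sigma, 1e-8)
    return mu, sigma


def cauchy_loglik(z, mu, sigma):
    """Cauchy log-likelihood of the sample z at (mu, sigma)."""
    return sum(-math.log(math.pi * sigma) - math.log1p(((v - mu) / sigma) ** 2) for v in z)


def gaussian_score(z):
    """Classical PCA criterion: variance of the projections."""
    return statistics.pvariance(z)


def cauchy_score(z):
    """Assumed robust criterion: negative maximised Cauchy log-likelihood."""
    mu, sigma = cauchy_mle(z)
    return -cauchy_loglik(z, mu, sigma)


def first_direction(data, criterion, step_deg=2):
    """Grid search over angles in [0, 180) for the direction maximising the criterion."""
    best_score, best_deg = None, None
    for deg in range(0, 180, step_deg):
        t = math.radians(deg)
        z = [x * math.cos(t) + y * math.sin(t) for x, y in data]
        score = criterion(z)
        if best_score is None or score > best_score:
            best_score, best_deg = score, deg
    return best_deg


# Toy data (illustrative only): bulk elongated along the x-axis,
# plus four gross outliers along the y-axis.
rng = random.Random(0)
data = [(rng.gauss(0, 3), rng.gauss(0, 0.5)) for _ in range(200)]
data += [(0.0, 50.0), (0.0, -50.0), (0.0, 55.0), (0.0, -55.0)]

gauss_deg = first_direction(data, gaussian_score)
cauchy_deg = first_direction(data, cauchy_score)
print("variance-based direction (degrees from x-axis):", gauss_deg)
print("Cauchy-based direction (degrees from x-axis):", cauchy_deg)
```

In this toy setup the variance-based direction is pulled towards the outliers, whereas the Cauchy-based direction stays much closer to the long axis of the bulk, because each outlier enters the Cauchy log-likelihood only logarithmically. A grid search is feasible only in two dimensions; the high-dimensional setting addressed in the paper needs a proper optimisation scheme.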