Adaptive Principal Component Analysis

Cited: 0
Authors
Li, Xiangyu [1 ]
Wang, Hua [1 ]
Affiliations
[1] Colorado Sch Mines, Dept Comp Sci, Golden, CO 80401 USA
Source
PROCEEDINGS OF THE 2022 SIAM INTERNATIONAL CONFERENCE ON DATA MINING, SDM | 2022
Funding
U.S. National Science Foundation;
Keywords
Principal Component Analysis; Adaptive Loss Function; Proximal Alternating Linearized Minimization; Robustness; ESTIMATORS;
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Principal Component Analysis (PCA) is a powerful approach that can reduce the dimensionality of input data and increase the ease of data interpretation. One major obstacle to using traditional PCA is its dependence on the squared l(2)-norm distance, which is highly sensitive to outliers and noisy data. This significantly limits the applicability of PCA to large or high-dimensional datasets. In this paper, we review previous efforts to overcome this problem and further propose a novel adaptive PCA model. One broadly used hypothesis is that the small losses of most data points follow a Gaussian distribution, while the large losses of a few data points follow a Laplacian distribution. Based on this observation, we choose to use the adaptive loss function that integrates both the l(1)-norm and squared l(2)-norm distances into a single loss function. To solve the resulting non-convex optimization problem, we derive an efficient iterative algorithm that guarantees that both the objective and solution sequences converge to global optimal solutions at a sub-linear convergence rate. The theoretical convergence analysis of our model is non-trivial and is presented comprehensively in our work. In comparison to other commonly used dimensionality reduction methods, our experiments on seven real-world datasets showed a great advantage of our model regardless of whether the data was noisy or not. We thus anticipate that our method will have broad usage for a wide range of real-world applications.
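The abstract describes a single loss that behaves like the squared l(2)-norm on the many small, Gaussian-like residuals and like the l(1)-norm on the few large, Laplacian-like residuals. As an illustrative sketch only (the paper's exact formulation is not given here; the form below and the parameter `sigma` are assumptions), one widely used loss with exactly this interpolating behavior is:

```python
def adaptive_loss(r, sigma=1.0):
    """Adaptive loss on a residual magnitude r >= 0.

    For r << sigma the value is approximately ((1 + sigma) / sigma) * r**2
    (quadratic, like a squared l2 penalty); for r >> sigma it is
    approximately (1 + sigma) * r (linear, like an l1 penalty), so a few
    large outliers contribute only linearly instead of quadratically.
    """
    return (1.0 + sigma) * r ** 2 / (r + sigma)

# Small residual: near-quadratic regime (sigma = 1, so ~ 2 * r**2).
small = adaptive_loss(0.01)
# Large residual: near-linear regime (sigma = 1, so ~ 2 * r).
large = adaptive_loss(100.0)
```

The single parameter `sigma` controls where the loss switches from quadratic to linear growth, which is what lets one objective handle clean and outlier-contaminated residuals at once.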
Pages: 486-494
Page count: 9