A preliminary geometric structure simplification for Principal Component Analysis

Cited by: 2
Authors
Gu, Huamao [1]
Lin, Tong [2]
Wang, Xun [1]
Affiliations
[1] Zhejiang Gongshang Univ, Sch Comp Sci & Informat Engn, Hangzhou 310018, Zhejiang, Peoples R China
[2] Peking Univ, Sch Elect Engn & Comp Sci, State Key Lab Machine Percept, Beijing 100871, Peoples R China
Funding
US National Science Foundation;
Keywords
Data preprocessing; PCA; Geometric structure; Dimensionality reduction; Eigenmaps
DOI
10.1016/j.neucom.2018.05.119
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Real-world data are commonly geometrically nonlinear and are therefore difficult to process with traditional linear methods. Many existing techniques for nonlinear dimensionality reduction need careful parameter tuning and cannot be applied to real data stably and consistently. In this article we propose an efficient data preprocessing algorithm, called Curve Straightening Transformation (CST), to flatten the nonlinear geometric structure of data. After this step, Principal Component Analysis (PCA) and other linear projection methods are adequate for the dimensionality reduction task in most cases. In this respect, the proposed CST algorithm can be regarded as a geometric preprocessing step tailored for PCA. Comprehensive experiments on both artificial and real datasets demonstrate that the proposed preprocessing algorithm is able to simplify nonlinear geometric structures, and that the flattened data are suitable for further dimensionality reduction by linear methods such as PCA. (C) 2018 Elsevier B.V. All rights reserved.
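As a rough illustration of why curved structure is a problem for a linear method such as PCA, the minimal Python sketch below compares how much variance a single principal component captures for a straight line versus a circular arc. It is not the authors' CST algorithm, whose details are not given in this abstract; the function name `pca_project` and the two toy datasets are assumptions made purely for illustration.

```python
import numpy as np

def pca_project(X, n_components):
    """Plain PCA via SVD (illustrative helper, not from the paper).

    Returns the projected data and the fraction of total variance
    explained by each retained component.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center the data
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]               # top principal directions
    explained = (S ** 2) / np.sum(S ** 2)        # variance ratio per component
    return Xc @ components.T, explained[:n_components]

t = np.linspace(0.0, 1.0, 200)

# A straight line in 3-D: intrinsically 1-D and already "flat".
line = np.column_stack([t, 2.0 * t, -t])
line_proj, line_ratio = pca_project(line, 1)

# A half circle in 2-D: intrinsically 1-D but geometrically curved.
arc = np.column_stack([np.cos(np.pi * t), np.sin(np.pi * t)])
arc_proj, arc_ratio = pca_project(arc, 1)

print("variance kept, straight line:", line_ratio[0])
print("variance kept, curved arc:   ", arc_ratio[0])
```

For the straight line, one component retains essentially all of the variance, while for the arc a single linear direction cannot, even though both datasets are one-dimensional along the curve. This is the kind of gap that a flattening preprocessing step is meant to close before PCA is applied.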
Pages: 46-55
Number of pages: 10