Supervised projection pursuit - A dimensionality reduction technique optimized for probabilistic classification

Cited by: 8
Author
Barcaru, Andrei [1]
Affiliation
[1] Univ Groningen, Univ Med Ctr Groningen, Dept Lab Med, NL-9700 RB Groningen, Netherlands
Keywords
MODELS;
DOI
10.1016/j.chemolab.2019.103867
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
An important step in multivariate analysis is dimensionality reduction, which allows for better classification and easier visualization of the class structures in the data. Techniques such as PCA, PLS-DA and LDA are most often used to explore patterns in the data and to reduce its dimensions. Yet these techniques do not always reveal the structures in the data properly. To this end, this article proposes supervised projection pursuit (SuPP), based on the Jensen-Shannon divergence. Combining this metric with a powerful Monte Carlo based optimization algorithm yielded a versatile dimensionality reduction technique capable of working with high-dimensional data and missing observations. Combined with a Naive Bayes (NB) classifier, SuPP proved to be a powerful preprocessing tool for classification. Namely, on the Iris data set, the prediction accuracy of SuPP-NB is significantly higher than that of PCA-NB (p-value <= 4.02E-05 in a 2D latent space, p-value <= 3.00E-03 in a 3D latent space) and significantly higher than that of PLS-DA (p-value <= 1.17E-05 in a 2D latent space and p-value <= 3.08E-03 in a 3D latent space). The significantly higher accuracy for this particular data set is strong evidence of better class separation in the latent spaces obtained with SuPP.
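The idea in the abstract, scoring a projection by how well it separates the classes under the Jensen-Shannon divergence and searching for a good projection by random sampling, can be illustrated with a minimal sketch. This is not the article's SuPP algorithm. The histogram-based density estimate, the plain random search over unit directions, the two-class restriction, and all function names (`js_divergence`, `projection_score`, `random_search_direction`) are assumptions made here for illustration only, and the data are synthetic.

```python
import numpy as np


def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Inputs may be unnormalized counts; eps avoids log(0) for empty bins.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        return np.sum(a * np.log2(a / b))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def projection_score(X, y, w, bins=20):
    """JS divergence between the two class histograms of X projected onto w.

    Assumption: labels are 0 and 1, and densities are estimated by histograms
    on a shared grid (a simplification chosen for this sketch).
    """
    z = X @ w
    edges = np.linspace(z.min(), z.max(), bins + 1)
    h0, _ = np.histogram(z[y == 0], bins=edges)
    h1, _ = np.histogram(z[y == 1], bins=edges)
    return js_divergence(h0, h1)


def random_search_direction(X, y, n_trials=500, seed=0):
    """Naive Monte Carlo search: sample random unit directions, keep the best."""
    rng = np.random.default_rng(seed)
    best_w, best_score = None, -np.inf
    for _ in range(n_trials):
        w = rng.normal(size=X.shape[1])
        w /= np.linalg.norm(w)
        score = projection_score(X, y, w)
        if score > best_score:
            best_w, best_score = w, score
    return best_w, best_score


# Synthetic two-class data in 5 dimensions; the classes differ only along axis 0.
rng = np.random.default_rng(42)
X0 = rng.normal(size=(200, 5))
X1 = rng.normal(size=(200, 5))
X1[:, 0] += 6.0
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)

best_w, best_score = random_search_direction(X, y)
print("best direction:", np.round(best_w, 2))
print("JS divergence along it:", round(float(best_score), 3))
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 (non-overlapping distributions), which makes it a convenient bounded objective. A latent space of two or three dimensions, as in the abstract, would need several directions and a multi-class objective, both of which this sketch leaves out.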
Pages: 11
Related papers
45 items in total
[1]   Pareto models for discriminative multiclass linear dimensionality reduction [J].
Abou-Moustafa, Karim T. ;
De la Torre, Fernando ;
Ferrie, Frank P. .
PATTERN RECOGNITION, 2015, 48 (05) :1863-1877
[2]   Comparative analysis of nonlinear dimensionality reduction techniques for breast MRI segmentation [J].
Akhbardeh, Alireza ;
Jacobs, Michael A. .
MEDICAL PHYSICS, 2012, 39 (04) :2275-2289
[3]  
ANDERSON EDGAR, 1936, ANN MISSOURI BOT GARD, V23, P457, DOI 10.2307/2394164
[4]   Properties of classical and quantum Jensen-Shannon divergence [J].
Briet, Jop ;
Harremoes, Peter .
PHYSICAL REVIEW A, 2009, 79 (05)
[5]   Analysis of linear and nonlinear dimensionality reduction methods for gender classification of face images [J].
Buchala, S ;
Davey, N ;
Gale, TM ;
Frank, RJ .
INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2005, 36 (14) :931-942
[6]   Face recognition by regularized discriminant analysis [J].
Dai, Dao-Qing ;
Yuen, Pong C. .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2007, 37 (04) :1080-1085
[7]  
Errity A., 2007, 15 INT C DIG SIGN PR
[8]   The use of multiple measurements in taxonomic problems [J].
Fisher, RA .
ANNALS OF EUGENICS, 1936, 7 :179-188
[9]  
Forina M, 1988, PARVUS EXTENDABLE PA
[10]   PROJECTION PURSUIT ALGORITHM FOR EXPLORATORY DATA-ANALYSIS [J].
FRIEDMAN, JH ;
TUKEY, JW .
IEEE TRANSACTIONS ON COMPUTERS, 1974, C 23 (09) :881-890