Discriminant analysis of high-dimensional data: A comparison of principal components analysis and partial least squares data reduction methods

Cited by: 192
Author
Kemsley, EK
Affiliation
[1] Institute of Food Research, Colney, Norwich NR4 7UA, Norwich Research Park
Keywords
partial least squares; principal components analysis; linear discriminant analysis; infrared spectroscopy;
DOI
10.1016/0169-7439(95)00090-9
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Partial least squares (PLS) methods are presented as valuable alternatives to principal components analysis (PCA) for compressing high-dimensional data before performing linear discriminant analysis (LDA). It is shown that PLS can considerably improve class separation and thus discriminant ability. In general, fewer of the compressed dimensions are required to reach the same prediction success rate, and for some data sets PLS methods yield higher prediction success rates than those obtainable using PCA scores. Results are presented for two experimental data sets, comprising mid-infrared spectra of edible oils and plant seeds. The potential dangers of PLS methods are also demonstrated, in particular their ability to introduce apparent groupings into data where there is no inherent class structure.
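The danger noted at the end of the abstract can be illustrated with a minimal sketch. It is not the paper's procedure or data: the sample sizes, the random seed, and the `separation` helper are assumptions chosen for illustration. It takes pure noise with arbitrary two-class labels and compares the first PCA score, which ignores the labels, with the first PLS score, whose weight vector is proportional to X^T y.

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed, illustrative only
n, p = 40, 500                       # hypothetical sizes: few samples, many variables
X = rng.standard_normal((n, p))      # pure noise, no inherent class structure
y = np.repeat([0, 1], n // 2)        # arbitrary labels, unrelated to X

Xc = X - X.mean(axis=0)              # column-centre the data
yc = y - y.mean()                    # centre the class indicator

# First PCA score: projection onto the direction of maximum variance.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
t_pca = Xc @ Vt[0]

# First PLS score: weight vector proportional to X^T y, so the labels
# steer the projection.
w = Xc.T @ yc
w /= np.linalg.norm(w)
t_pls = Xc @ w

def separation(t, labels):
    """Absolute difference of class means divided by the pooled standard deviation."""
    a, b = t[labels == 0], t[labels == 1]
    pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return abs(a.mean() - b.mean()) / pooled

sep_pca = separation(t_pca, y)
sep_pls = separation(t_pls, y)
print(f"PCA score separation: {sep_pca:.2f}")
print(f"PLS score separation: {sep_pls:.2f}")
```

With many more variables than samples, the PLS score tends to separate the arbitrary classes clearly while the PCA score does not, which is why any apparent grouping in PLS scores should be checked on samples held out from the compression step.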
Pages: 47-61
Number of pages: 15
Related papers
50 records in total
  • [21] Optimal Linear Discriminant Analysis for High-Dimensional Functional Data
    Xue, Kaijie
    Yang, Jin
    Yao, Fang
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2024, 119 (546) : 1055 - 1064
  • [22] High-dimensional integrative copula discriminant analysis for multiomics data
    He, Yong
    Chen, Hao
    Sun, Hao
    Ji, Jiadong
    Shi, Yufeng
    Zhang, Xinsheng
    Liu, Lei
    STATISTICS IN MEDICINE, 2020, 39 (30) : 4869 - 4884
  • [23] Diagonal Discriminant Analysis With Feature Selection for High-Dimensional Data
    Romanes, Sarah E.
    Ormerod, John T.
    Yang, Jean Y. H.
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2020, 29 (01) : 114 - 127
  • [24] Powered partial least squares discriminant analysis
    Liland, Kristian Hovde
    Indahl, Ulf Geir
    JOURNAL OF CHEMOMETRICS, 2009, 23 (1-2) : 7 - 18
  • [25] SAS® partial least squares for discriminant analysis
    Reeves, James B., III
    Delwiche, Stephen R.
    JOURNAL OF NEAR INFRARED SPECTROSCOPY, 2008, 16 (01) : 31 - 38
  • [26] Evaluating the performance of sparse principal component analysis methods in high-dimensional data scenarios
    Bonner, Ashley J.
    Beyene, Joseph
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2017, 46 (05) : 3794 - 3811
  • [27] Multiset sparse partial least squares path modeling for high dimensional omics data analysis
    Csala, Attila
    Zwinderman, Aeilko H.
    Hof, Michel H.
    BMC BIOINFORMATICS, 2020, 21 (01)
  • [29] Lagged principal trend analysis for longitudinal high-dimensional data
    Zhang, Yuping
    STAT, 2019, 8 (01):
  • [30] Multilevel Functional Principal Component Analysis for High-Dimensional Data
    Zipunnikov, Vadim
    Caffo, Brian
    Yousem, David M.
    Davatzikos, Christos
    Schwartz, Brian S.
    Crainiceanu, Ciprian
    JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS, 2011, 20 (04) : 852 - 873