The effect of principal component analysis on machine learning accuracy with high dimensional spectral data

Cited by: 70
Authors
Howley, T [1]
Madden, MG [1]
O'Connell, ML [1]
Ryder, AG [1]
Affiliations
[1] Natl Univ Ireland Univ Coll Galway, Galway, Ireland
DOI
10.1007/1-84628-224-1_16
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents the results of an investigation into the use of machine learning methods for the identification of narcotics from Raman spectra. The classification of spectral data and other high dimensional data, such as images and gene-expression data, poses an interesting challenge to machine learning, as the presence of high numbers of redundant or highly correlated attributes can seriously degrade classification accuracy. This paper investigates the use of Principal Component Analysis (PCA) to reduce high dimensional spectral data and to improve the predictive performance of some well known machine learning methods. Experiments are carried out on a high dimensional spectral dataset. These experiments employ the NIPALS (Non-Linear Iterative Partial Least Squares) PCA method, a method that has been used in the field of chemometrics for spectral classification, and is a more efficient alternative to the widely used eigenvector decomposition approach. The experiments show that the use of this PCA method can improve the performance of machine learning in the classification of high dimensional data.
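The abstract names NIPALS but does not list its steps. The following is a minimal sketch of the standard NIPALS iteration for PCA, not the authors' implementation. It assumes the data are mean-centred first, extracts one component at a time, and deflates the data after each component. The function name `nipals_pca`, the parameters `tol` and `max_iter`, and the toy data are illustrative assumptions, and the code uses only the Python standard library so that it stays self-contained.

```python
import math


def nipals_pca(rows, n_components, tol=1e-10, max_iter=1000):
    """Sketch of NIPALS PCA. Returns (scores, loadings), one list per component.

    rows: list of equal-length lists of floats (samples x attributes).
    tol, max_iter: hypothetical convergence settings chosen for illustration.
    """
    n, m = len(rows), len(rows[0])
    # Mean-centre each attribute (column).
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    X = [[r[j] - means[j] for j in range(m)] for r in rows]

    scores, loadings = [], []
    for _ in range(n_components):
        # Start the score vector from the column with the largest sum of squares.
        j0 = max(range(m), key=lambda j: sum(X[i][j] ** 2 for i in range(n)))
        t = [X[i][j0] for i in range(n)]
        if sum(v * v for v in t) == 0.0:
            break  # no variance left to explain

        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            # Loading vector: p = X^T t / (t^T t), then scaled to unit length.
            p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
            norm = math.sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]
            # Updated score vector: t = X p.
            t_new = [sum(X[i][j] * p[j] for j in range(m)) for i in range(n)]
            diff = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if diff <= tol * sum(v * v for v in t):
                break

        # Deflate: remove the extracted component from the data.
        for i in range(n):
            for j in range(m):
                X[i][j] -= t[i] * p[j]
        scores.append(t)
        loadings.append(p)
    return scores, loadings


# Toy data (made up for illustration): 6 samples, 3 attributes.
toy = [
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 1.9],
    [2.2, 2.9, 0.8],
    [1.9, 2.2, 1.1],
    [3.1, 3.0, 0.2],
    [2.3, 2.7, 0.9],
]
toy_scores, toy_loadings = nipals_pca(toy, n_components=2)
print(toy_loadings)
```

Because each component is computed separately, only as many components as needed are extracted. This is the usual reason NIPALS is preferred over a full eigenvector decomposition when the number of attributes is large. The score vectors returned here could then be used as reduced inputs to a classifier.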
Pages: 209 / +
Page count: 3