Principal component analysis with missing values: a comparative survey of methods

Cited by: 0
Authors
Stéphane Dray
Julie Josse
Affiliations
[1] Université de Lyon, Applied Mathematics Department
[2] Université Lyon 1
[3] CNRS
[4] UMR5558
[5] Laboratoire de Biométrie et Biologie Evolutive
[6] Agrocampus Ouest
Source
Plant Ecology | 2015 / Vol. 216
Keywords
Imputation; Ordination; PCA; Traits
DOI
Not available
Abstract
Principal component analysis (PCA) is a standard technique for summarizing the main structures of a data table containing measurements of several quantitative variables for a number of individuals. Here, we study the case where some of the data values are missing and review methods that adapt PCA to missing data. In plant ecology, this statistical challenge relates to the current effort to compile global plant functional trait databases, which produces matrices with a large number of missing values. We present several techniques to handle or estimate (impute) missing values in PCA and compare them using theoretical considerations. We carried out a simulation study to evaluate the relative merits of the different approaches in various situations (correlation structure, number of variables and individuals, and percentage of missing values) and also applied them to a real data set. Lastly, we discuss the advantages and drawbacks of these approaches, their potential pitfalls, and the challenges that remain to be addressed.
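To make the idea of imputing missing values within PCA concrete, here is a minimal Python sketch of one common approach, iterative PCA imputation. The abstract does not name the specific methods compared, so the choice of this algorithm is an assumption. The function name `iterative_pca_impute`, its parameters, and the NumPy-based implementation are likewise illustrative and are not taken from the paper.

```python
import numpy as np


def iterative_pca_impute(X, n_components=2, max_iter=500, tol=1e-8):
    """Fill NaN entries of X by alternating a rank-k PCA fit and re-imputation.

    Illustrative sketch only: names and defaults are assumptions,
    not the authors' implementation.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column-mean imputation (means use observed values only).
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(max_iter):
        mean = filled.mean(axis=0)
        # Rank-k reconstruction of the centred table via truncated SVD.
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        k = n_components
        recon = (U[:, :k] * s[:k]) @ Vt[:k] + mean
        # Observed cells stay fixed; only missing cells are updated.
        updated = np.where(missing, recon, X)
        change = np.sum((updated - filled) ** 2)
        filled = updated
        if change < tol:
            break
    return filled


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical table: 30 individuals, 5 correlated quantitative variables.
    scores = rng.normal(size=(30, 1))
    loadings = rng.normal(size=(1, 5))
    table = scores @ loadings + 0.1 * rng.normal(size=(30, 5))
    # Hide roughly 10% of the cells at random.
    table[rng.random(table.shape) < 0.1] = np.nan
    completed = iterative_pca_impute(table, n_components=1)
    print(np.isnan(completed).sum())
```

The number of retained components (`n_components`) has to be chosen by the user in this sketch; the completed table can then be passed to an ordinary PCA.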
Pages: 657 - 667
Number of pages: 10
Related papers
50 in total
  • [21] Quantitative Interpretation of Mineral Hyperspectral Images Based on Principal Component Analysis and Independent Component Analysis Methods
    Jiang, Xiping
    Jiang, Yu
    Wu, Fang
    Wu, Fenghuang
    APPLIED SPECTROSCOPY, 2014, 68 (04) : 502 - 509
  • [22] The Connections between Principal Component Analysis and Dimensionality Reduction Methods of Manifolds
    Li, Bo
    Liu, Jin
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS: WITH ASPECTS OF ARTIFICIAL INTELLIGENCE, 2012, 6839 : 638 - +
  • [23] Classification of phases based on a Principal Component Analysis for Intrusion Detection Methods
    Rajaallah, El Mostafa
    INTERNATIONAL JOURNAL OF MATHEMATICS AND COMPUTER SCIENCE, 2020, 15 (04) : 1221 - 1234
  • [24] IMPUTATION OF MISSING DATA USING BAYESIAN PRINCIPAL COMPONENT ANALYSIS ON TEC IONOSPHERIC SATELLITE DATASET
    Subashini, P.
    Krishnaveni, M.
    2011 24TH CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (CCECE), 2011, : 1540 - 1543
  • [25] Anomaly Detection Based on Kernel Principal Component and Principal Component Analysis
    Wang, Wei
    Zhang, Min
    Wang, Dan
    Jiang, Yu
    Li, Yuliang
    Wu, Hongda
    COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS, 2019, 463 : 2222 - 2228
  • [26] Exploration of Principal Component Analysis: Deriving Principal Component Analysis Visually Using Spectra
    Beattie, J. Renwick
    Esmonde-White, Francis W. L.
    APPLIED SPECTROSCOPY, 2021, 75 (04) : 361 - 375
  • [27] A review on missing values for main challenges and methods
    Ren, Lijuan
    Wang, Tao
    Seklouli, Aicha Sekhari
    Zhang, Haiqing
    Bouras, Abdelaziz
    INFORMATION SYSTEMS, 2023, 119
  • [28] Weighted Principal Component Analysis
    Fan, Zizhu
    Liu, Ergen
    Xu, Baogen
    ARTIFICIAL INTELLIGENCE AND COMPUTATIONAL INTELLIGENCE, PT III, 2011, 7004 : 569 - 574
  • [29] Ensemble Principal Component Analysis
    Dorabiala, Olga
    Aravkin, Aleksandr Y.
    Kutz, J. Nathan
    IEEE ACCESS, 2024, 12 : 6663 - 6671
  • [30] A Generalization of Principal Component Analysis
    Battaglino, Samuele
    Koyuncu, Erdem
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3607 - 3611