Principal Variables Analysis for Non-Gaussian Data

Cited by: 0
Authors
Clark-Boucher, Dylan [1]
Miller, Jeffrey W. [1]
Affiliations
[1] Harvard Univ, Dept Biostat, 655 Huntington Ave, Boston, MA 02115 USA
Keywords
Non-normality; Ordinal data; Variable selection; X-linked dystonia parkinsonism; COMPONENT ANALYSIS; DISCARDING VARIABLES; ALGORITHMS; SELECTION;
DOI
10.1080/10618600.2024.2367098
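Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes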
020208 ; 070103 ; 0714 ;
Abstract
Principal variables analysis (PVA) is a technique for selecting a subset of variables that capture as much of the information in a dataset as possible. Existing approaches for PVA are based on the Pearson correlation matrix, which is not well-suited to describing the relationships between non-Gaussian variables. We propose a generalized approach to PVA enabling the use of different types of correlation, and we explore using Spearman, Gaussian copula, and polychoric correlations as alternatives to Pearson correlation. We compare performance in simulation studies varying the form of the true multivariate distribution over a range of possibilities. Our results show that on continuous non-Gaussian data, using generalized PVA with Gaussian copula or Spearman correlations provides a major improvement in performance compared to Pearson. On ordinal data, generalized PVA with polychoric correlations outperforms the rest by a wide margin. We apply generalized PVA to a dataset of 102 clinical variables measured on individuals with X-linked dystonia parkinsonism (XDP), a neurodegenerative disorder involving symptoms of both dystonia and parkinsonism. We find that using different types of correlation yields substantively different sets of principal variables; for example, parkinsonism-related metrics appear more explanatory than dystonia-related metrics on the observed data. Supplementary materials for this article are available online.
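The idea of running PVA on a correlation matrix other than Pearson's can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the abstract does not state the selection criterion, so the sketch assumes a greedy forward search that at each step picks the variable removing the most residual variance (a trace-type criterion). The function names `spearman_corr` and `greedy_principal_variables` and the simulated data are hypothetical, and the rank computation assumes continuous data without ties.

```python
import numpy as np

def spearman_corr(X):
    """Spearman correlation matrix: Pearson correlation of column-wise ranks.
    Assumption: continuous data with no ties (ties are not averaged here)."""
    ranks = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)

def greedy_principal_variables(R, k, tol=1e-12):
    """Greedily select k variables from a correlation matrix R.
    Assumed criterion: at each step, choose the variable j that maximizes
    sum_i R[i, j]^2 / R[j, j], i.e. the reduction in the trace of the
    residual matrix after regressing all variables on variable j."""
    R = np.array(R, dtype=float)
    selected = []
    for _ in range(k):
        diag = np.diag(R)
        # Variables with (near-)zero residual variance are already explained.
        score = np.where(diag > tol,
                         (R ** 2).sum(axis=0) / np.maximum(diag, tol),
                         -np.inf)
        score[selected] = -np.inf
        j = int(np.argmax(score))
        selected.append(j)
        # Residual correlation after partialling out variable j.
        R = R - np.outer(R[:, j], R[j, :]) / R[j, j]
    return selected

# Hypothetical example: two latent factors observed through monotone,
# non-Gaussian transformations, plus one independent noise variable.
rng = np.random.default_rng(0)
z1, z2, noise = rng.standard_normal((3, 500))
X = np.column_stack([np.exp(z1), z1 ** 3, z2, np.tanh(z2), noise])

selected = greedy_principal_variables(spearman_corr(X), k=2)
print("Selected variable indices:", selected)
```

Because Spearman correlation depends only on ranks, the two monotone transformations of each latent factor are treated as fully redundant, so the sketch should pick one variable per factor. Replacing `spearman_corr` with another correlation estimate (for instance a Gaussian copula or polychoric correlation, computed by whatever routine is available) would leave the selection step unchanged.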
Pages: 374-383
Number of pages: 10
Related papers
50 in total
  • [1] Functional principal component analysis estimator for non-Gaussian data
    Zhong, Rou
    Liu, Shishi
    Li, Haocheng
    Zhang, Jingxiao
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2022, 92 (13) : 2788 - 2801
  • [2] Robust functional principal component analysis for non-Gaussian longitudinal data
    Zhong, Rou
    Liu, Shishi
    Li, Haocheng
    Zhang, Jingxiao
    JOURNAL OF MULTIVARIATE ANALYSIS, 2022, 189
  • [3] Compressed Principal Component Analysis of Non-Gaussian Vectors
    Mignolet, Marc
    Soize, Christian
    SIAM-ASA JOURNAL ON UNCERTAINTY QUANTIFICATION, 2020, 8 (04) : 1261 - 1286
  • [4] Analysis of non-Gaussian POLSAR data
    Doulgeris, Anthony
    Anfinsen, Stian Normann
    Eltoft, Torbjorn
    IGARSS: 2007 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, VOLS 1-12: SENSING AND UNDERSTANDING OUR PLANET, 2007, : 160 - 163
  • [5] Functional Principal Component Analysis for Continuous Non-Gaussian, Truncated, and Discrete Functional Data
    Dey, Debangan
    Ghosal, Rahul
    Merikangas, Kathleen
    Zipunnikov, Vadim
    STATISTICS IN MEDICINE, 2024, 43 (28) : 5431 - 5445
  • [6] S-chart for non-Gaussian variables
    Sim, CH
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2000, 65 (02) : 147 - 156
  • [7] Experimental non-Gaussian manipulation of continuous variables
    Wenger, Jerome
    Ourjoumtsev, Alexei
    Laurat, Julien
    Tualle-Brouri, Rosa
    Grangier, Philippe
    QUANTUM INFORMATION WITH CONTINUOUS VARIABLES OF ATOMS AND LIGHT, 2007, : 389 - +
  • [8] Fluctuation relations with intermittent non-Gaussian variables
    Budini, Adrian A.
    PHYSICAL REVIEW E, 2011, 84 (06):
  • [9] Non-Gaussian Penalized PARAFAC Analysis for fMRI Data
    Liang, Jingsai
    Zou, Jiancheng
    Hong, Don
    FRONTIERS IN APPLIED MATHEMATICS AND STATISTICS, 2019, 5
  • [10] Elements of a non-gaussian analysis on the spaces of functions of infinitely many variables
    Kachanovsky N.A.
    Ukrainian Mathematical Journal, 2011, 62 (9) : 1420 - 1448