Exploring dimension learning via a penalized probabilistic principal component analysis

Cited by: 3
Authors
Deng, Wei Q. [1, 2]
Craiu, Radu V. [3]
Affiliations
[1] McMaster Univ, Dept Psychiat & Behav Neurosci, Hamilton, ON, Canada
[2] St Josephs Healthcare Hamilton, Peter Boris Ctr Addict Res, Hamilton, ON, Canada
[3] Univ Toronto, Dept Stat Sci, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Dimension estimation; model selection; penalization; principal component analysis; probabilistic principal component analysis; profile likelihood; SELECTION; COVARIANCE; NUMBER; EIGENVALUES; SHRINKAGE; TESTS;
DOI
10.1080/00949655.2022.2100890
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Establishing a low-dimensional representation of the data leads to efficient data learning strategies. In many cases, the reduced dimension needs to be explicitly stated and estimated from the data. We explore the estimation of dimension in finite samples as a constrained optimization problem, where the estimated dimension is a maximizer of a penalized profile likelihood criterion within the framework of probabilistic principal component analysis. Unlike other penalized maximization problems that require an 'optimal' penalty tuning parameter, we propose a data-averaging procedure whereby the estimated dimension emerges as the most favourable choice over a range of plausible penalty parameters. The proposed heuristic is compared to a large number of alternative criteria in simulations and in an application to gene expression data. Extensive simulation studies reveal that no method uniformly dominates the others and highlight the importance of subject-specific knowledge in choosing statistical methods for dimension learning. Our application results also suggest that gene expression data have a higher intrinsic dimension than previously thought. Overall, our proposed heuristic strikes a good balance and is the method of choice when the data deviate moderately from the model assumptions.
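The general idea in the abstract (maximize a penalized profile likelihood over candidate dimensions, then choose the dimension favoured across a range of penalty values) can be illustrated with a minimal sketch. It is not the authors' procedure: the profile log-likelihood is the standard probabilistic PCA form, while the BIC-style penalty, the penalty grid `delta_grid`, the majority-vote rule, and the function names are all assumptions made here for illustration only.

```python
import numpy as np


def ppca_profile_loglik(eigvals, n, k):
    """Standard PPCA log-likelihood at dimension k, profiled over the
    loadings and the noise variance (eigvals sorted in decreasing order)."""
    p = eigvals.size
    sigma2 = eigvals[k:].mean()  # ML noise variance: mean of discarded eigenvalues
    return -0.5 * n * (
        p * np.log(2 * np.pi)
        + np.log(eigvals[:k]).sum()
        + (p - k) * np.log(sigma2)
        + p
    )


def estimate_dimension(X, delta_grid=None):
    """Illustrative estimator: for each penalty weight delta, pick the k that
    maximizes loglik(k) - delta * (log n / 2) * n_params(k), then return the
    k chosen most often over the grid (an assumed voting rule)."""
    if delta_grid is None:
        delta_grid = np.linspace(0.5, 3.0, 26)  # assumed range of penalties
    n, p = X.shape
    # Eigenvalues of the ML (biased) sample covariance, largest first.
    eigvals = np.linalg.eigvalsh(np.cov(X, rowvar=False, bias=True))[::-1]
    eigvals = np.clip(eigvals, 1e-12, None)  # guard against log(0)

    ks = np.arange(1, min(n, p))  # candidate dimensions 1 .. min(n, p) - 1
    loglik = np.array([ppca_profile_loglik(eigvals, n, k) for k in ks])
    # Free parameters of the loadings (up to rotation) plus the noise variance.
    n_params = p * ks - ks * (ks - 1) / 2 + 1

    choices = []
    for delta in delta_grid:
        criterion = loglik - delta * 0.5 * np.log(n) * n_params
        choices.append(ks[np.argmax(criterion)])
    votes = np.bincount(np.array(choices))
    return int(votes.argmax())  # ties resolve to the smallest dimension
```

Calling `estimate_dimension(X)` on an `n` by `p` data matrix returns a single integer. With `delta = 1` the criterion reduces to a BIC-type rule, so the grid can be read as a band of penalties around that reference point.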
Pages: 266-297
Number of pages: 32