Exploring dimension learning via a penalized probabilistic principal component analysis

Cited: 3
Authors
Deng, Wei Q. [1,2]
Craiu, Radu V. [3]
Affiliations
[1] McMaster Univ, Dept Psychiat & Behav Neurosci, Hamilton, ON, Canada
[2] St Josephs Healthcare Hamilton, Peter Boris Ctr Addict Res, Hamilton, ON, Canada
[3] Univ Toronto, Dept Stat Sci, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Dimension estimation; model selection; penalization; principal component analysis; probabilistic principal component analysis; profile likelihood; SELECTION; COVARIANCE; NUMBER; EIGENVALUES; SHRINKAGE; TESTS;
DOI
10.1080/00949655.2022.2100890
Chinese Library Classification
TP39 [Applications of Computers];
Discipline Classification Codes
081203; 0835;
Abstract
Establishing a low-dimensional representation of the data leads to efficient data learning strategies. In many cases, the reduced dimension needs to be explicitly stated and estimated from the data. We explore the estimation of dimension in finite samples as a constrained optimization problem, where the estimated dimension is a maximizer of a penalized profile likelihood criterion within the framework of a probabilistic principal component analysis. Unlike other penalized maximization problems that require an 'optimal' penalty tuning parameter, we propose a data-averaging procedure whereby the estimated dimension emerges as the most favourable choice over a range of plausible penalty parameters. The proposed heuristic is compared to a large number of alternative criteria in simulations and an application to gene expression data. Extensive simulation studies reveal that none of the methods uniformly dominates the others and highlight the importance of subject-specific knowledge in choosing statistical methods for dimension learning. Our application results also suggest that gene expression data have a higher intrinsic dimension than previously thought. Overall, our proposed heuristic strikes a good balance and is the method of choice when model assumptions are moderately violated.
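The selection scheme described in the abstract can be sketched roughly as follows. This is a hypothetical illustration, not the authors' exact criterion: it uses the standard Tipping–Bishop PPCA profile log-likelihood, a generic PPCA parameter count as the penalty weight, and a simple modal vote over a grid of penalty multipliers to stand in for the paper's data-averaging procedure.

```python
import numpy as np

def ppca_profile_loglik(eigvals, q, n):
    """Tipping-Bishop PPCA profile log-likelihood for q components,
    given the eigenvalues of the sample covariance (descending order)."""
    d = len(eigvals)
    sigma2 = eigvals[q:].mean()  # noise variance from trailing eigenvalues
    return -0.5 * n * (np.sum(np.log(eigvals[:q]))
                       + (d - q) * np.log(sigma2)
                       + d * np.log(2 * np.pi) + d)

def estimate_dimension(X, penalties):
    """Hypothetical sketch: for each penalty weight, select the dimension
    maximizing the penalized profile likelihood; return the modal choice
    across the penalty grid (a stand-in for the paper's data averaging)."""
    n, d = X.shape
    eigvals = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]
    picks = []
    for gamma in penalties:
        scores = []
        for q in range(1, d):
            nu = d * q - q * (q - 1) / 2 + 1  # free parameters of a q-dim PPCA
            scores.append(ppca_profile_loglik(eigvals, q, n) - gamma * nu)
        picks.append(int(np.argmax(scores)) + 1)
    vals, counts = np.unique(picks, return_counts=True)
    return int(vals[np.argmax(counts)])

rng = np.random.default_rng(0)
# Synthetic data: 3 strong latent directions plus isotropic noise in 10 dims.
W = 3.0 * rng.normal(size=(10, 3))
X = rng.normal(size=(500, 3)) @ W.T + rng.normal(size=(500, 10))
q_hat = estimate_dimension(X, penalties=np.linspace(0.5, 5.0, 10) * np.log(500))
```

With a clear eigenvalue gap, as in this simulation, most penalty weights in the grid agree on the same dimension, so the modal vote is stable; the paper's point is precisely that such agreement replaces the need to tune a single 'optimal' penalty.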
Pages: 266-297 (32 pages)