Exploring dimension learning via a penalized probabilistic principal component analysis

Cited by: 3
Authors
Deng, Wei Q. [1, 2]
Craiu, Radu V. [3]
Affiliations
[1] McMaster Univ, Dept Psychiat & Behav Neurosci, Hamilton, ON, Canada
[2] St Josephs Healthcare Hamilton, Peter Boris Ctr Addict Res, Hamilton, ON, Canada
[3] Univ Toronto, Dept Stat Sci, Toronto, ON, Canada
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Dimension estimation; model selection; penalization; principal component analysis; probabilistic principal component analysis; profile likelihood; SELECTION; COVARIANCE; NUMBER; EIGENVALUES; SHRINKAGE; TESTS;
DOI
10.1080/00949655.2022.2100890
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Establishing a low-dimensional representation of the data leads to efficient data learning strategies. In many cases, the reduced dimension needs to be explicitly stated and estimated from the data. We explore the estimation of dimension in finite samples as a constrained optimization problem, where the estimated dimension is a maximizer of a penalized profile likelihood criterion within the framework of probabilistic principal component analysis. Unlike other penalized maximization problems that require an 'optimal' penalty tuning parameter, we propose a data-averaging procedure whereby the estimated dimension emerges as the most favourable choice over a range of plausible penalty parameters. The proposed heuristic is compared to a large number of alternative criteria in simulations and in an application to gene expression data. Extensive simulation studies reveal that no method uniformly dominates the others and highlight the importance of subject-specific knowledge in choosing statistical methods for dimension learning. Our application results also suggest that gene expression data have a higher intrinsic dimension than previously thought. Overall, our proposed heuristic strikes a good balance and is the method of choice when model assumptions are moderately violated.
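The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not specify the penalty or the averaging rule, so the parameter-count penalty, the grid `deltas`, and the majority-vote rule in `select_dimension` are all assumptions made for illustration. Only the profile log-likelihood is standard (the Tipping and Bishop PPCA likelihood maximised for a fixed dimension k). The functions take the sample covariance eigenvalues, assumed strictly positive and sorted in decreasing order.

```python
import math
from collections import Counter


def ppca_profile_loglik(eigvals, n, k):
    """PPCA log-likelihood maximised over all parameters for a fixed dimension k.

    eigvals: sample covariance eigenvalues, positive, in decreasing order.
    n: sample size.
    """
    p = len(eigvals)
    # MLE of the noise variance: mean of the p - k discarded eigenvalues.
    sigma2 = sum(eigvals[k:]) / (p - k)
    signal = sum(math.log(lam) for lam in eigvals[:k])
    return -0.5 * n * (signal + (p - k) * math.log(sigma2)
                       + p * math.log(2 * math.pi) + p)


def n_free_params(p, k):
    # Loadings identified up to rotation, plus the noise variance.
    # (Assumed penalty basis; the mean vector is left out.)
    return p * k - k * (k - 1) // 2 + 1


def select_dimension(eigvals, n, deltas):
    """Hypothetical rule: return the k chosen most often over the penalty grid."""
    p = len(eigvals)
    votes = Counter()
    for delta in deltas:
        # Penalised profile criterion for each candidate k = 0, ..., p - 1.
        best_k = max(
            range(p),
            key=lambda k: ppca_profile_loglik(eigvals, n, k)
            - delta * n_free_params(p, k),
        )
        votes[best_k] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    eigvals = [10.0, 8.0, 1.0, 1.0, 1.0, 1.0]  # two clear signal directions
    print(select_dimension(eigvals, n=200, deltas=[1.0, 2.0, 3.0, 5.0, 10.0]))
```

Voting over a grid avoids committing to one tuning value: a dimension that wins across many plausible penalties is preferred over one that wins only for a narrow range.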
Pages: 266-297
Number of pages: 32
Related papers
50 in total
  • [21] A class of learning algorithms for principal component analysis and minor component analysis
    Zhang, QF
    Leung, YW
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2000, 11 (02): : 529 - 533
  • [22] Trajectory Learning Using Principal Component Analysis
    Osman, Asmaa A. E.
    El-Khoribi, Reda A.
    Shoman, Mahmoud E.
    Shalaby, M. A. Wahby
    RECENT ADVANCES IN INFORMATION SYSTEMS AND TECHNOLOGIES, VOL 1, 2017, 569 : 174 - 183
  • [23] Local Sparse Principal Component Analysis for Exploring the Spatial Distribution of Social Infrastructure
    Hong, Seong-Yun
    Moon, Seonggook
    Chi, Sang-Hyun
    Cho, Yoon-Jae
    Kang, Jeon-Young
    LAND, 2022, 11 (11)
  • [24] Classification of College Students' Mobile Learning Strategies Based on Principal Component Analysis and Probabilistic Neural Network
    Hu, Shuai
    Cheng, Yingxin
    PROCEEDINGS OF 2015 4TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2015), 2015, : 58 - 61
  • [25] On multi-fault detection of rolling bearing through probabilistic principal component analysis denoising and Higuchi fractal dimension transformation
    Yang, Xiaomin
    Xiang, Yongbing
    Jiang, Bingzhen
    JOURNAL OF VIBRATION AND CONTROL, 2022, 28 (9-10) : 1214 - 1226
  • [26] Study of Defect Feature Dimension Reduction Based on Principal Component Analysis
    Han Fangfang
    Zhu Junchao
    Zhang Baofeng
    Duan Fajie
    PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012, : 1367 - 1371
  • [27] Speaker Recognition Based on Principal Component Analysis and Probabilistic Neural Network
    Zhou, Yan
    Shang, Li
    ADVANCED INTELLIGENT COMPUTING THEORIES AND APPLICATIONS: WITH ASPECTS OF ARTIFICIAL INTELLIGENCE, 2012, 6839 : 708 - 715
  • [28] SELF-PACED PROBABILISTIC PRINCIPAL COMPONENT ANALYSIS FOR DATA WITH OUTLIERS
    Zhao, Bowen
    Xiao, Xi
    Zhang, Wanpeng
    Zhang, Bin
    Gan, Guojun
    Xia, Shutao
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3737 - 3741
  • [29] Robust deflated principal component analysis via multiple instance factorings for dimension reduction in remote sensing images
    Ganaa, Ernest Domanaanmwi
    Abeo, Timothy Apasiba
    Wang, Liangjun
    Song, He-Ping
    Shen, Xiang-Jun
    JOURNAL OF APPLIED REMOTE SENSING, 2020, 14 (03)
  • [30] An Infinitesimal Probabilistic Model for Principal Component Analysis of Manifold Valued Data
    Stefan Sommer
    Sankhya A, 2019, 81 (1): : 37 - 62