Exploring dimension learning via a penalized probabilistic principal component analysis

Cited by: 3

Authors:

Deng, Wei Q. [1,2]

Craiu, Radu V. [3]

Affiliations:

[1] McMaster Univ, Dept Psychiat & Behav Neurosci, Hamilton, ON, Canada

[2] St Josephs Healthcare Hamilton, Peter Boris Ctr Addict Res, Hamilton, ON, Canada

[3] Univ Toronto, Dept Stat Sci, Toronto, ON, Canada

Source:

JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION | 2023 / Vol. 93 / Issue 02

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Dimension estimation; model selection; penalization; principal component analysis; probabilistic principal component analysis; profile likelihood; SELECTION; COVARIANCE; NUMBER; EIGENVALUES; SHRINKAGE; TESTS;

DOI:

10.1080/00949655.2022.2100890

Chinese Library Classification:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

Establishing a low-dimensional representation of the data leads to efficient data learning strategies. In many cases, the reduced dimension needs to be explicitly stated and estimated from the data. We explore the estimation of dimension in finite samples as a constrained optimization problem, where the estimated dimension is a maximizer of a penalized profile likelihood criterion within the framework of a probabilistic principal components analysis. Unlike other penalized maximization problems that require an 'optimal' penalty tuning parameter, we propose a data-averaging procedure whereby the estimated dimension emerges as the most favourable choice over a range of plausible penalty parameters. The proposed heuristic is compared to a large number of alternative criteria in simulations and in an application to gene expression data. Extensive simulation studies reveal that none of the methods uniformly dominates the others and highlight the importance of subject-specific knowledge in choosing statistical methods for dimension learning. Our application results also suggest that gene expression data have a higher intrinsic dimension than previously thought. Overall, our proposed heuristic strikes a good balance and is the method of choice when the data deviate moderately from the model assumptions.
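The idea in the abstract, choosing the dimension that is favoured across a range of penalty values rather than tuning a single penalty, can be illustrated with a rough sketch. This is not the authors' criterion: the profile log-likelihood below is the standard probabilistic PCA form, while the penalty (a multiple `delta` of a free-parameter count), the penalty grid, the majority-vote aggregation, and all function names are assumptions made purely for illustration.

```python
import numpy as np
from collections import Counter


def ppca_profile_loglik(eigvals, n, k):
    """Standard PPCA profile log-likelihood at dimension k (0 <= k < p).

    eigvals: sample covariance eigenvalues, sorted in decreasing order.
    """
    p = eigvals.size
    sigma2 = eigvals[k:].mean()  # ML noise variance: mean of discarded eigenvalues
    return -0.5 * n * (
        p * np.log(2 * np.pi)
        + np.log(eigvals[:k]).sum()
        + (p - k) * np.log(sigma2)
        + p
    )


def n_free_params(p, k):
    """Free parameters of a rank-k PPCA covariance (assumed penalty weight)."""
    return p * k - k * (k - 1) // 2 + 1


def estimate_dimension(X, deltas):
    """Return the dimension selected most often over the penalty grid `deltas`.

    Majority voting is an illustrative stand-in for the paper's procedure.
    """
    n, p = X.shape
    S = np.cov(X, rowvar=False, bias=True)
    eigvals = np.linalg.eigvalsh(S)[::-1]  # eigvalsh is ascending; reverse it
    ks = np.arange(p)                      # candidate dimensions 0..p-1
    loglik = np.array([ppca_profile_loglik(eigvals, n, k) for k in ks])
    n_par = np.array([n_free_params(p, k) for k in ks])
    picks = [int(ks[np.argmax(loglik - d * n_par)]) for d in deltas]
    return Counter(picks).most_common(1)[0][0], picks


# Simulated data: 3 strong latent directions plus unit-variance noise.
rng = np.random.default_rng(0)
n, p, k_true = 500, 10, 3
Q, _ = np.linalg.qr(rng.normal(size=(p, k_true)))      # orthonormal loadings
Z = rng.normal(size=(n, k_true)) * np.array([5.0, 4.0, 3.0])
X = Z @ Q.T + rng.normal(size=(n, p))

# Hypothetical grid, roughly spanning AIC-like to beyond BIC-like penalties.
deltas = np.linspace(1.0, np.log(n), 20)
k_hat, picks = estimate_dimension(X, deltas)
print("selected dimension:", k_hat)
```

In this sketch `delta = 1` corresponds to an AIC-like penalty and `delta = log(n) / 2` to a BIC-like one, so the grid simply brackets familiar choices; the paper's own penalty and range of plausible values may differ.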


Pages: 266 - 297

Page count: 32

