Dimension estimation;
model selection;
penalization;
principal component analysis;
probabilistic principal component analysis;
profile likelihood;
SELECTION;
COVARIANCE;
NUMBER;
EIGENVALUES;
SHRINKAGE;
TESTS;
DOI:
10.1080/00949655.2022.2100890
Chinese Library Classification number:
TP39 [Computer applications];
Discipline classification codes:
081203;
0835;
Abstract:
Establishing a low-dimensional representation of the data leads to efficient data learning strategies. In many cases, the reduced dimension needs to be explicitly stated and estimated from the data. We explore the estimation of dimension in finite samples as a constrained optimization problem, where the estimated dimension is a maximizer of a penalized profile likelihood criterion within the framework of probabilistic principal component analysis. Unlike other penalized maximization problems that require an 'optimal' penalty tuning parameter, we propose a data-averaging procedure whereby the estimated dimension emerges as the most favourable choice over a range of plausible penalty parameters. The proposed heuristic is compared to a large number of alternative criteria in simulations and in an application to gene expression data. Extensive simulation studies reveal that none of the methods uniformly dominates the others and highlight the importance of subject-specific knowledge in choosing statistical methods for dimension learning. Our application results also suggest that gene expression data have a higher intrinsic dimension than previously thought. Overall, our proposed heuristic strikes a good balance and is the method of choice when the data deviate moderately from the model assumptions.
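The idea of choosing the dimension that is favoured across a range of penalty parameters can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not state the penalty or the averaging rule, so the linear penalty `delta * n_free_params`, the majority vote over a grid of `delta` values, and all function names below are assumptions made for illustration. Only the profile log-likelihood follows the standard probabilistic PCA form, in which the noise variance is the mean of the discarded sample eigenvalues.

```python
import numpy as np

def ppca_profile_loglik(eigvals, n, k):
    """Maximized PPCA log-likelihood when k components are retained.

    eigvals: sample covariance eigenvalues, sorted in decreasing order.
    """
    p = eigvals.size
    sigma2 = eigvals[k:].mean()  # ML noise variance: mean of discarded eigenvalues
    return -0.5 * n * (np.log(eigvals[:k]).sum()
                       + (p - k) * np.log(sigma2)
                       + p * np.log(2 * np.pi) + p)

def n_free_params(p, k):
    # Loadings identified up to rotation, plus the noise variance.
    # The mean vector is omitted because it does not depend on k.
    return p * k - k * (k - 1) // 2 + 1

def vote_dimension(X, deltas):
    """Return the k that maximizes the penalized criterion most often over deltas.

    ASSUMPTION: a simple majority vote stands in for the paper's averaging rule.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    eigvals = np.linalg.eigvalsh(Xc.T @ Xc / n)[::-1]  # decreasing order
    eigvals = np.clip(eigvals, 1e-12, None)            # guard against log(0)
    ks = np.arange(p)                                  # candidates k = 0, ..., p-1
    loglik = np.array([ppca_profile_loglik(eigvals, n, k) for k in ks])
    penalty = np.array([n_free_params(p, k) for k in ks])
    votes = np.zeros(p, dtype=int)
    for delta in deltas:
        votes[np.argmax(loglik - delta * penalty)] += 1
    return int(np.argmax(votes)), votes

# Hypothetical usage: 3 strong components in 10 dimensions, unit noise.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((10, 3)))
X_demo = rng.standard_normal((500, 3)) @ (Q * np.array([5.0, 4.0, 3.0])).T
X_demo += rng.standard_normal((500, 10))
k_hat, votes = vote_dimension(X_demo, np.linspace(1.0, np.log(500), 25))
```

The grid here runs from an AIC-like weight to a heavier, BIC-like one; which range counts as "plausible" is a modelling choice that the sketch does not settle.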
Affiliation:
Yunnan Univ Finance & Econ, Sch Math & Stat, Kunming 650221, Peoples R China
Zhao, Jianhua
Yu, Philip L. H.
Affiliation:
Univ Hong Kong, Dept Stat & Actuarial Sci, Hong Kong, Hong Kong, Peoples R China
Yu, Philip L. H.
Kwok, James T.
Affiliation:
Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
Affiliation:
Univ Calif Santa Barbara, Dept Stat & Appl Probabil, 5511 South Hall, Santa Barbara, CA 93106 USA
Gu, Mengyang
Shen, Weining
Affiliation:
Univ Calif Irvine, Dept Stat, 2206 Bren Hall, Irvine, CA 92697 USA