CONSISTENCY OF AIC AND BIC IN ESTIMATING THE NUMBER OF SIGNIFICANT COMPONENTS IN HIGH-DIMENSIONAL PRINCIPAL COMPONENT ANALYSIS

Cited by: 33
Authors
Bai, Zhidong [1, 2]
Choi, Kwok Pui [3 ]
Fujikoshi, Yasunori [4 ]
Affiliations
[1] Northeast Normal Univ, KLAS MOE, Changchun 130024, Jilin, Peoples R China
[2] Northeast Normal Univ, Sch Math & Stat, Changchun 130024, Jilin, Peoples R China
[3] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore 117546, Singapore
[4] Hiroshima Univ, Grad Sch Sci, Dept Math, Hiroshima 7398526, Japan
Keywords
AIC; BIC; consistency; dimensionality; high-dimensional framework; number of significant components; principal component analysis; random matrix theory; signal processing; spiked model; SMALLEST EIGENVALUES; COVARIANCE-MATRIX; MODEL SELECTION; LINEAR-MODEL; CRITERIA; REGRESSION; EQUALITY;
DOI
10.1214/17-AOS1577
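Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract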
In this paper, we study the problem of estimating the number of significant components in principal component analysis (PCA), which corresponds to the number of dominant eigenvalues of the covariance matrix of p variables. Our purpose is to examine the consistency of the estimation criteria AIC and BIC based on the model selection criteria by Akaike [In 2nd International Symposium on Information Theory (1973) 267-281, Akademia Kiado] and Schwarz [Ann. Statist. 6 (1978) 461-464] under a high-dimensional asymptotic framework. Using random matrix theory techniques, we derive sufficient conditions for the criterion to be strongly consistent for the case when the dominant population eigenvalues are bounded, and when the dominant eigenvalues tend to infinity. Moreover, the asymptotic results are obtained without a normality assumption on the population distribution. Simulation studies are also conducted, and results show that the sufficient conditions in our theorems are essential.
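As a rough illustration of the kind of criterion the abstract refers to, the sketch below selects the number of dominant eigenvalues by minimizing a likelihood-based AIC- or BIC-type score over candidate models in which the smallest p - j eigenvalues are equal. The formula is the classical form for p < n, measured relative to the largest candidate model j = p - 1; this record does not state the exact expressions used in the paper, so the formula, the function names `criterion_values` and `estimate_num_components`, and the `kind` parameter are all assumptions made for illustration.

```python
import numpy as np


def criterion_values(eigvals, n, kind="aic"):
    """Return the AIC- or BIC-type score for each candidate j = 0, ..., p-1.

    Assumed (classical, p < n) form, relative to the model j = p - 1:
        n * log( mean(tail)^(p-j) / prod(tail) ) - penalty(j),
    where tail holds the p - j smallest sample eigenvalues and
        penalty(j) = (p-j-1)(p-j+2)               for AIC,
        penalty(j) = 0.5 * (p-j-1)(p-j+2) log(n)  for BIC.
    """
    l = np.sort(np.asarray(eigvals, dtype=float))[::-1]  # descending order
    p = l.size
    values = np.empty(p)
    for j in range(p):
        tail = l[j:]          # eigenvalues assumed equal under model j
        m = p - j
        # log of (arithmetic mean)^m divided by the product of the tail
        log_ratio = m * np.log(tail.mean()) - np.log(tail).sum()
        pairs = (m - 1) * (m + 2)
        penalty = pairs if kind == "aic" else 0.5 * pairs * np.log(n)
        values[j] = n * log_ratio - penalty
    return values


def estimate_num_components(eigvals, n, kind="aic"):
    """Pick the candidate j with the smallest score."""
    return int(np.argmin(criterion_values(eigvals, n, kind)))
```

In practice the sample eigenvalues might be obtained with `np.linalg.eigvalsh(np.cov(X, rowvar=False))` for an n-by-p data matrix `X`. Whether the selected value is consistent depends on conditions such as those the abstract describes, which this sketch does not check.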
Pages: 1050-1076
Page count: 27
Related papers
30 in total
[1] [Anonymous], 1973, 2 INT S INF THEOR BU, DOI [10.1007/978-1-4612-0919-5_38, 10.1007/978-0-387-98135-2, 10.1007/978-1-4612-1694-0]
[2] Bai ZD, 1998, ANN PROBAB, V26, P316
[3] Limit of the smallest eigenvalue of a large dimensional sample covariance-matrix [J]. Bai, ZD; Yin, YQ. ANNALS OF PROBABILITY, 1993, 21 (03): 1275-1294
[4] Bai ZD, 1990, ADV SPECTRUM ANAL AR, V2, P327
[5] On sample eigenvalues in a generalized spiked population model [J]. Bai, Zhidong; Yao, Jianfeng. JOURNAL OF MULTIVARIATE ANALYSIS, 2012, 106: 167-177
[6] Selection of components in principal component analysis - a comparison of methods [J]. Ferre, L. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 1995, 19 (06): 669-682
[7] Fujikoshi Y., 2010, MULTIVARIATE STAT HI
[8] Asymptotic distribution of the LR statistic for equality of the smallest eigenvalues in high-dimensional principal component analysis [J]. Fujikoshi, Yasunori; Yamada, Takayuki; Watanabe, Daisuke; Sugiyama, Takakazu. JOURNAL OF MULTIVARIATE ANALYSIS, 2007, 98 (10): 2002-2008
[9] High-dimensional consistency of rank estimation criteria in multivariate linear model [J]. Fujikoshi, Yasunori; Sakurai, Tetsuro. JOURNAL OF MULTIVARIATE ANALYSIS, 2016, 149: 199-212
[10] Consistency of high-dimensional AIC-type and Cp-type criteria in multivariate linear regression [J]. Fujikoshi, Yasunori; Sakurai, Tetsuro; Yanagihara, Hirokazu. JOURNAL OF MULTIVARIATE ANALYSIS, 2014, 123: 184-200