On the number of principal components: A test of dimensionality based on measurements of similarity between matrices

Cited by: 95
Authors
Dray, Stephane [1]
Affiliation
[1] Univ Lyon 1, Univ Lyon, Lab Biomet & Biol Evolut, CNRS,UMR 5558, F-69622 Villeurbanne, France
Keywords
co-inertia criterion; permutation procedure; RV coefficient; singular value decomposition; simulation study; stopping rules
DOI
10.1016/j.csda.2007.07.015
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
An important problem in principal component analysis (PCA) is the estimation of the correct number of components to retain. PCA is most often used to reduce a set of observed variables to a new set of variables of lower dimensionality. The choice of this dimensionality is a crucial step for the interpretation of results or subsequent analyses, because it could lead to a loss of information (underestimation) or the introduction of random noise (overestimation). New techniques are proposed to evaluate the dimensionality in PCA. They are based on similarity measurements, singular value decomposition and permutation procedures. A simulation study is conducted to evaluate the relative merits of the proposed approaches. Results showed that one method based on the RV coefficient is very accurate and seems to be more efficient than other existing approaches. (c) 2007 Elsevier B.V. All rights reserved.
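The abstract names the RV coefficient and singular value decomposition as building blocks but gives no formulas. As a minimal sketch only, and not the paper's actual test, the block below computes the standard RV coefficient between a column-centered data matrix and its rank-k SVD approximations. The function names `rv_coefficient` and `rank_k_approximation`, the simulated data, and the choice to compare the data with its own low-rank approximation are all assumptions made for illustration; the permutation procedure mentioned in the abstract is not implemented here.

```python
import numpy as np


def rv_coefficient(x, y):
    """Standard RV coefficient between two matrices that share the same rows.

    Both matrices are column-centered first. The value lies in [0, 1].
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    sx = x @ x.T  # n x n cross-product matrix of x
    sy = y @ y.T  # n x n cross-product matrix of y
    numerator = np.trace(sx @ sy)
    denominator = np.sqrt(np.trace(sx @ sx) * np.trace(sy @ sy))
    return numerator / denominator


def rank_k_approximation(x, k):
    """Best rank-k approximation of x (Eckart-Young) via truncated SVD."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u[:, :k] * s[:k]) @ vt[:k, :]


# Hypothetical simulated data: 2 latent components plus small noise.
rng = np.random.default_rng(0)
scores = rng.normal(size=(50, 2))
loadings = rng.normal(size=(2, 6))
data = scores @ loadings + 0.1 * rng.normal(size=(50, 6))
centered = data - data.mean(axis=0)

# Similarity between the data and each rank-k approximation, k = 1..6.
rv_by_rank = [
    rv_coefficient(centered, rank_k_approximation(centered, k))
    for k in range(1, 7)
]
```

In this sketch the similarity can only increase with k and reaches 1 at full rank, so on its own it does not say where to stop; a stopping rule needs a reference distribution, which is the role the abstract assigns to permutation procedures.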
Pages: 2228-2237
Number of pages: 10