On the number of principal components: A test of dimensionality based on measurements of similarity between matrices

Cited by: 95
Authors
Dray, Stephane [1]
Affiliations
[1] Univ Lyon 1, Univ Lyon, Lab Biomet & Biol Evolut, CNRS,UMR 5558, F-69622 Villeurbanne, France
Keywords
co-inertia criterion; permutation procedure; RV coefficient; singular value decomposition; simulation study; stopping rules
DOI
10.1016/j.csda.2007.07.015
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
An important problem in principal component analysis (PCA) is the estimation of the correct number of components to retain. PCA is most often used to reduce a set of observed variables to a new set of variables of lower dimensionality. The choice of this dimensionality is a crucial step for the interpretation of results or subsequent analyses, because it could lead to a loss of information (underestimation) or the introduction of random noise (overestimation). New techniques are proposed to evaluate the dimensionality in PCA. They are based on similarity measurements, singular value decomposition and permutation procedures. A simulation study is conducted to evaluate the relative merits of the proposed approaches. Results showed that one method based on the RV coefficient is very accurate and seems to be more efficient than other existing approaches. (c) 2007 Elsevier B.V. All rights reserved.
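The abstract names the RV coefficient and singular value decomposition as building blocks. The sketch below is only an illustration of those two ingredients, not the test proposed in the paper: it computes the standard RV coefficient between a column-centered data matrix and its rank-k SVD approximation for increasing k. The function names, the simulated data, and the choice to compare the data with its own low-rank reconstruction are assumptions made for this example; the permutation procedure mentioned in the abstract is not reproduced here.

```python
import numpy as np


def rv_coefficient(X, Y):
    """RV coefficient between two matrices that share the same rows.

    Both matrices are column-centered first. The result lies in [0, 1].
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sx = Xc @ Xc.T  # n x n cross-product matrix of X
    Sy = Yc @ Yc.T  # n x n cross-product matrix of Y
    num = np.trace(Sx @ Sy)
    den = np.sqrt(np.trace(Sx @ Sx) * np.trace(Sy @ Sy))
    return num / den


def rank_k_approximation(X, k):
    """Best rank-k approximation of the column-centered X, via SVD."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


# Simulated data (assumption): two underlying components plus small noise.
rng = np.random.default_rng(0)
scores = rng.normal(size=(50, 2))
loadings = rng.normal(size=(2, 6))
X = scores @ loadings + 0.1 * rng.normal(size=(50, 6))

# Similarity between the data and its rank-k reconstruction, k = 1..6.
rv_by_rank = [rv_coefficient(X, rank_k_approximation(X, k)) for k in range(1, 7)]
for k, rv in enumerate(rv_by_rank, start=1):
    print(f"k={k}: RV={rv:.4f}")
```

In this setup the RV value cannot decrease as k grows and reaches 1 at full rank, so by itself it does not select a dimensionality; a stopping rule such as the permutation-based one described in the abstract is still needed to decide where additional components stop being informative.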
Pages: 2228-2237
Number of pages: 10