Reflections on univariate and multivariate analysis of metabolomics data

Cited by: 467
Authors
Saccenti, Edoardo [1, 2]
Hoefsloot, Huub C. J. [1, 2]
Smilde, Age K. [1, 2]
Westerhuis, Johan A. [1, 2]
Hendriks, Margriet M. W. B. [2, 3]
Affiliations
[1] Univ Amsterdam, Swammerdam Inst Life Sci, Biosyst Data Anal Grp, NL-1098 XH Amsterdam, Netherlands
[2] Netherlands Metabol Ctr, NL-2333 CL Leiden, Netherlands
[3] Leiden Acad Ctr Drug Res, NL-2333 CL Leiden, Netherlands
Keywords
Univariate analysis; Multivariate analysis; Hypothesis testing; Multiple test correction; Overfitting; Consistency at large; NMR-BASED METABOLOMICS; STATISTICAL VALIDATION; DISCRIMINANT-ANALYSIS; SHRUNKEN CENTROIDS; POWERFUL APPROACH; FEATURE-SELECTION; HIGHER CRITICISM; GENE-EXPRESSION; DATA SETS; CLASSIFICATION
DOI
10.1007/s11306-013-0598-6
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Subject classification code
1002; 100201
Abstract
Metabolomics experiments usually produce large quantities of data. Univariate and multivariate analysis techniques are routinely used to extract relevant information from these data, with the aim of providing biological knowledge about the problem studied. Although statistical tools such as the t test, analysis of variance, principal component analysis, and partial least squares discriminant analysis form the statistical backbone of the vast majority of metabolomics papers, many basic but fundamental questions are still asked often, such as: Why do the results of univariate and multivariate analyses differ? Why apply univariate methods if a multivariate method has already been applied? Why can a difference that is not visible univariately be visible multivariately? In the present paper we address some aspects of univariate and multivariate analysis, with the aim of clarifying in simple terms the main differences between the two approaches. Applications of the t test, analysis of variance, principal component analysis, and partial least squares discriminant analysis are shown on both real and simulated metabolomics data examples to provide an overview of fundamental aspects of univariate and multivariate methods.
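The question of why a multivariate method can detect a difference that univariate tests miss can be illustrated with a small sketch. The example below is not taken from the paper: the data are hand-constructed toy values (two strongly correlated "metabolites", hypothetical names `x1` and `x2`, shifted slightly in opposite directions between two groups), the helper names are invented for illustration, and the critical values are approximate tabulated figures. It compares a pooled two-sample t statistic per variable with a two-sample Hotelling T² test, used here as a stand-in for the multivariate view.

```python
from statistics import mean, variance

def pooled_t(a, b):
    """Two-sample Student t statistic (pooled variance) for one variable."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(b) - mean(a)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

def cov(u, v):
    """Sample covariance of two equally long sequences."""
    mu, mv = mean(u), mean(v)
    return sum((x - mu) * (y - mv) for x, y in zip(u, v)) / (len(u) - 1)

def hotelling_f(a1, a2, b1, b2):
    """Two-sample Hotelling T^2 for two variables, returned as an F statistic
    with (2, na + nb - 3) degrees of freedom."""
    na, nb = len(a1), len(b1)
    dof = na + nb - 2
    # Pooled 2x2 within-group covariance matrix [[s11, s12], [s12, s22]].
    s11 = ((na - 1) * variance(a1) + (nb - 1) * variance(b1)) / dof
    s22 = ((na - 1) * variance(a2) + (nb - 1) * variance(b2)) / dof
    s12 = ((na - 1) * cov(a1, a2) + (nb - 1) * cov(b1, b2)) / dof
    d1, d2 = mean(b1) - mean(a1), mean(b2) - mean(a2)
    det = s11 * s22 - s12 ** 2
    # Squared Mahalanobis distance between the group means (explicit 2x2 inverse).
    mahal2 = (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
    t2 = na * nb / (na + nb) * mahal2
    p = 2  # number of variables
    return t2 * (dof - p + 1) / (dof * p)

# Toy data (assumption): a large shared component z plus a small opposing term e,
# so x1 and x2 are strongly correlated within each group.
n = 20
z = [-2 + 4 * i / (n - 1) for i in range(n)]
e = [0.1 * (-1) ** i for i in range(n)]
shift = 0.3  # group B: x1 raised and x2 lowered by this amount

a_x1 = [zi + ei for zi, ei in zip(z, e)]
a_x2 = [zi - ei for zi, ei in zip(z, e)]
b_x1 = [x + shift for x in a_x1]
b_x2 = [x - shift for x in a_x2]

t_x1 = pooled_t(a_x1, b_x1)
t_x2 = pooled_t(a_x2, b_x2)
f_stat = hotelling_f(a_x1, a_x2, b_x1, b_x2)

T_CRIT = 2.02  # approximate two-sided 5% critical value of t with 38 d.o.f.
F_CRIT = 3.25  # approximate 5% critical value of F with (2, 37) d.o.f.

print(f"t(x1) = {t_x1:.2f}, t(x2) = {t_x2:.2f}, critical |t| ~ {T_CRIT}")
print(f"Hotelling F = {f_stat:.2f}, critical F ~ {F_CRIT}")
```

In this construction the group shift is small relative to each variable's own spread, so neither t statistic exceeds its critical value, but the shift is large relative to the spread of the difference between the two variables, which is the direction the multivariate statistic exploits. Real metabolomics data would also require multiple test correction and validation against overfitting, which this sketch does not attempt.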
Pages: 361-374
Page count: 14