State-of-the art data normalization methods improve NMR-based metabolomic analysis

Cited by: 163
Authors
Kohl, Stefanie M. [1]
Klein, Matthias S. [1]
Hochrein, Jochen [1]
Oefner, Peter J. [1]
Spang, Rainer [1]
Gronwald, Wolfram [1]
Affiliation
[1] Univ Regensburg, Inst Funct Genom, D-93053 Regensburg, Germany
Keywords
Metabolomics; NMR; Data normalization; Preprocessing; Classification; H-1-NMR; SPECTROSCOPY; METABONOMICS; PROTEOMICS; ALIGNMENT; VARIANCE; URINE;
DOI
10.1007/s11306-011-0350-z
Chinese Library Classification (CLC) number
R5 [Internal Medicine];
Subject classification code
1002 ; 100201 ;
Abstract
Extracting biomedical information from large metabolomic datasets by multivariate data analysis is a complex task. Common challenges include, among others, screening for differentially produced metabolites, estimation of fold changes, and sample classification. Prior to these analysis steps, it is important to minimize contributions from unwanted biases and experimental variance. This is the goal of data preprocessing. In this work, different data normalization methods were compared systematically using two datasets generated by nuclear magnetic resonance (NMR) spectroscopy. Two types of normalization methods were used: one aims to remove unwanted sample-to-sample variation, while the other adjusts the variance of the different metabolites by variable scaling and variance stabilization. The impact of all tested methods on sample classification was evaluated on urinary NMR fingerprints obtained from healthy volunteers and patients suffering from autosomal dominant polycystic kidney disease (ADPKD). Performance in screening for differentially produced metabolites was investigated on a dataset following a Latin-square design, in which varied amounts of 8 different metabolites were spiked into a human urine matrix while the total spike-in amount was kept constant. In addition, specific tests were conducted to systematically investigate the influence of the different preprocessing methods on the structure of the analyzed data. In conclusion, preprocessing methods originally developed for DNA microarray analysis, in particular Quantile and Cubic-Spline Normalization, performed best in reducing bias, accurately detecting fold changes, and classifying samples.
Pages: S146-S160
Page count: 15