MetICA: independent component analysis for high-resolution mass-spectrometry based non-targeted metabolomics

Cited by: 18
Authors
Liu, Youzhong [1, 2]
Smirnov, Kirill [1 ]
Lucio, Marianna [1 ]
Gougeon, Regis D. [2 ]
Alexandre, Herve [2 ]
Schmitt-Kopplin, Philippe [1, 3]
Affiliations
[1] Helmholtz Zentrum Munchen, Dept Environm Sci, Res Unit Analyt BioGeoChem, Ingolstadter Landstr 1, D-85758 Neuherberg, Germany
[2] UMR PAM Univ Bourgogne, Inst Univ Vigne & Vin, Agrosup Dijon, Rue Claude Ladrey,BP 27877, Dijon, France
[3] Tech Univ Munich, Chair Analyt Food Chem, Alte Akad 10, 85354 Freising Weihenstephan, Germany
Source
BMC BIOINFORMATICS | 2016 / Volume 17
Keywords
VISUALIZATION; EXPRESSION; CLASSIFICATION; ALGORITHM; SPECTRA;
DOI
10.1186/s12859-016-0970-4
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Interpreting non-targeted metabolomics data remains a challenging task. Signals from non-targeted metabolomics studies stem from a combination of biological causes, complex interactions between them, and experimental bias/noise. The resulting data matrix usually contains a huge number of variables and only a few samples, and classical techniques using nonlinear mapping could result in computational complexity and overfitting. As a linear method, Independent Component Analysis (ICA) could potentially bring more meaningful results than Principal Component Analysis (PCA). However, a major problem with most ICA algorithms is that the output varies between runs, so the result of a single ICA run should be interpreted with caution. Results: ICA was applied to simulated and experimental mass spectrometry (MS)-based non-targeted metabolomics data, under the hypothesis that the underlying sources are mutually independent. Inspired by the Icasso algorithm, a new ICA method, MetICA, was developed to handle the instability of ICA on complex datasets. Like the original Icasso algorithm, MetICA evaluates the algorithmic and statistical reliability of ICA runs. In addition, MetICA suggests two ways to select the optimal number of model components and gives an order of interpretation for the components obtained. Conclusions: Correlating the components obtained with prior biological knowledge makes it possible to understand how non-targeted metabolomics data reflect biological nature and technical phenomena. Mass signals related to this information could also be extracted. This novel approach provides meaningful components due to their independent nature. Furthermore, it provides an innovative concept on which to base model selection: optimizing the number of reliable components instead of trying to fit the data. The current version of MetICA is available at https://github.com/daniellyz/MetICA.
Pages: 14