Bayesian Independent Component Analysis Recovers Pathway Signatures from Blood Metabolomics Data

Cited by: 21
Authors
Krumsiek, Jan [1]
Suhre, Karsten [1]
Illig, Thomas [2, 3]
Adamski, Jerzy [4, 5]
Theis, Fabian J. [1, 6]
Affiliations
[1] Helmholtz Zentrum Munchen, Inst Bioinformat & Syst Biol, Munich, Germany
[2] Helmholtz Zentrum Munchen, Res Unit Mol Epidemiol, Munich, Germany
[3] Hannover Med Sch, Biobank, Hannover, Germany
[4] Helmholtz Zentrum Munchen, Genome Anal Ctr, Inst Expt Genet, Munich, Germany
[5] Tech Univ Munich, Lehrstuhl Expt Genet, D-85350 Freising Weihenstephan, Germany
[6] Tech Univ Munich, Dept Math, D-85350 Freising Weihenstephan, Germany
Funding
European Research Council
Keywords
metabolomics; independent component analysis; Bayesian; systems biology; bioinformatics; blood serum; population cohorts; FMRI DATA; PROFILES; CLASSIFICATION;
DOI
10.1021/pr300231n
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Interpreting the complex interplay of metabolites in heterogeneous biosamples still poses a challenging task. In this study, we propose independent component analysis (ICA) as a multivariate analysis tool for the interpretation of large-scale metabolomics data. In particular, we employ a Bayesian ICA method based on a mean-field approach, which allows us to statistically infer the number of independent components to be reconstructed. The advantage of ICA over correlation-based methods like principal component analysis (PCA) is the utilization of higher order statistical dependencies, which not only yield additional information but also allow a more meaningful representation of the data with fewer components. We performed the described ICA approach on a large-scale metabolomics data set of human serum samples, comprising a total of 1764 study probands with 218 measured metabolites. Inspecting the source matrix of statistically independent metabolite profiles using a weighted enrichment algorithm, we observe strong enrichment of specific metabolic pathways in all components. This includes signatures from amino acid metabolism, energy-related processes, carbohydrate metabolism, and lipid metabolism. Our results imply that the human blood metabolome is composed of a distinct set of overlaying, statistically independent signals. ICA furthermore produces a mixing matrix, describing the strength of each independent component for each of the study probands. Correlating these values with plasma high-density lipoprotein (HDL) levels, we establish a novel association between HDL plasma levels and the branched-chain amino acid pathway. We conclude that the Bayesian ICA methodology has the power and flexibility to replace many of the PCA and clustering-based analyses currently common in the research field.
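The last analysis step in the abstract, correlating each proband's component strengths with a measured phenotype such as HDL, can be illustrated with a minimal sketch. It assumes the mixing matrix is already available as a samples-by-components table. The data below are synthetic stand-ins, not the study's data, and the function names (`pearson_r`, `component_correlations`) are hypothetical. The ICA estimation itself, which the paper performs with a Bayesian mean-field method, is not reproduced here.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def component_correlations(mixing, phenotype):
    """Correlate every mixing-matrix column with a phenotype vector.

    mixing: list of rows, one per sample; each row holds that sample's
            strength for every independent component.
    phenotype: one value per sample (for example a measured HDL level).
    """
    n_components = len(mixing[0])
    return [
        pearson_r([row[k] for row in mixing], phenotype)
        for k in range(n_components)
    ]


# Synthetic stand-in data (an assumption for illustration only):
# 200 samples, 3 components, standard-normal component strengths.
rng = random.Random(0)
mixing = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]

# Hypothetical phenotype driven by component index 1 plus noise.
phenotype = [2.0 * row[1] + rng.gauss(0, 0.5) for row in mixing]

r_values = component_correlations(mixing, phenotype)

# Index of the component most strongly associated with the phenotype.
strongest = max(range(len(r_values)), key=lambda k: abs(r_values[k]))
print(strongest, [round(r, 2) for r in r_values])
```

In a real analysis the resulting correlations would also need significance testing and correction for the number of components tested. The sketch only shows the column-wise association step.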
Pages: 4120-4131
Number of pages: 12