Constructing metabolic association networks using high-dimensional mass spectrometry data

Cited by: 0
Authors
Koo, Imhoi [1]
Wei, Xiaoli [1]
Shi, Xue [1]
Zhou, Zhanxiang [2]
Kim, Seongho [3]
Zhang, Xiang [1]
Affiliations
[1] Univ Louisville, Dept Chem, Ctr Regulatory & Environm Analyt Metabol, Louisville, KY 40292 USA
[2] Univ N Carolina, Dept Nutr, Greensboro, NC 27412 USA
[3] Wayne State Univ, Sch Med, Dept Oncol, Biostat Core, Karmanos Canc Inst, Detroit, MI 48201 USA
Funding
U.S. National Science Foundation;
Keywords
Metabolomics; Gaussian graphical model; Partial correlation; Independent component regression; Principal component regression; Partial least squares regression; Extrinsic similarity; PARTIAL LEAST-SQUARES; INDEPENDENT COMPONENT ANALYSIS; REDUCTION;
DOI
10.1016/j.chemolab.2014.07.002
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
The goal of constructing metabolic association networks is to identify the topology of a metabolic network for a better understanding of molecular mechanisms. An accurate metabolic association network enables investigation of the functional behavior of metabolites in a cell or tissue. Gaussian graphical model (GGM)-based methods have been widely used in genomics to infer biological networks. However, the performance of the various GGM-based methods in constructing metabolic association networks remains unknown in metabolomics. The performance of principal component regression (PCR), independent component regression (ICR), shrinkage covariance estimate (SCE), partial least squares regression (PLSR), and extrinsic similarity (ES) methods in constructing metabolic association networks was compared by estimating partial correlation coefficient matrices when the number of variables is larger than the sample size. To do this, the sample size and the network density (complexity) were treated as variables for network construction. Simulation studies show that, in terms of F1 scores, PCR and ICR are more stable with respect to sample size and network density than SCE and PLSR. These methods were further applied to the analysis of experimental metabolomics data acquired from metabolite extracts of mouse liver. For the simulated data, the proposed methods PCR and ICR outperform the other methods when the network density is large, while PLSR and SCE perform better when the network density is small. For the experimental metabolomics data, PCR and ICR discover more significant edges and perform better than PLSR and SCE when the discovered edges are evaluated using KEGG pathways. These results suggest that the metabolic network might be more complex and that PCR and ICR therefore have an advantage over PLSR and SCE in constructing metabolic association networks. (C) 2014 Elsevier B.V. All rights reserved.
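As a rough illustration of the quantities the abstract refers to, the sketch below estimates a partial correlation matrix from a shrunk covariance when the number of variables exceeds the sample size, then scores recovered edges with F1. It is not the paper's implementation: the fixed shrinkage intensity `lam`, the edge `threshold`, and the function names are all assumptions made for this example, and the shrinkage step is a simple diagonal-target blend rather than the SCE estimator evaluated in the study.

```python
import numpy as np

def shrunk_covariance(X, lam=0.2):
    """Blend the sample covariance with its own diagonal.

    lam is a hypothetical fixed shrinkage intensity (assumption); for
    lam > 0 the result is invertible even when variables outnumber samples.
    """
    S = np.cov(X, rowvar=False)
    return (1.0 - lam) * S + lam * np.diag(np.diag(S))

def partial_correlations(cov):
    """Partial correlations from the precision matrix Omega = cov^-1:
    rho_ij = -Omega_ij / sqrt(Omega_ii * Omega_jj)."""
    omega = np.linalg.inv(cov)
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def edge_f1(pcor, true_adj, threshold=0.1):
    """F1 score of edges called by |rho_ij| > threshold (assumed cutoff)
    against a known adjacency matrix, using the upper triangle only."""
    iu = np.triu_indices_from(pcor, k=1)
    pred = np.abs(pcor[iu]) > threshold
    truth = true_adj[iu].astype(bool)
    tp = np.sum(pred & truth)
    fp = np.sum(pred & ~truth)
    fn = np.sum(~pred & truth)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

# Synthetic data with more variables (p) than samples (n).
rng = np.random.default_rng(0)
n, p = 20, 30
X = rng.standard_normal((n, p))
pcor = partial_correlations(shrunk_covariance(X))
```

In a simulation study of the kind described, `edge_f1` would be evaluated against the adjacency matrix of the network used to generate the data, across a range of sample sizes and network densities.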
Pages: 193-202
Page count: 10