Constructing metabolic association networks using high-dimensional mass spectrometry data

Cited by: 0

Authors:

Koo, Imhoi [1]

Wei, Xiaoli [1]

Shi, Xue [1]

Zhou, Zhanxiang [2]

Kim, Seongho [3]

Zhang, Xiang [1]

Affiliations:

[1] Univ Louisville, Dept Chem, Ctr Regulatory & Environm Analyt Metabol, Louisville, KY 40292 USA

[2] Univ N Carolina, Dept Nutr, Greensboro, NC 27412 USA

[3] Wayne State Univ, Sch Med, Dept Oncol, Biostat Core, Karmanos Canc Inst, Detroit, MI 48201 USA

Source:

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS | 2014 / Vol. 138

Funding:

US National Science Foundation;

Keywords:

Metabolomics; Gaussian graphical model; Partial correlation; Independent component regression; Principal component regression; Partial least squares regression; Extrinsic similarity; PARTIAL LEAST-SQUARES; INDEPENDENT COMPONENT ANALYSIS; REDUCTION;

DOI:

10.1016/j.chemolab.2014.07.002

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

The goal of metabolic association networks is to identify the topology of a metabolic network for a better understanding of molecular mechanisms. An accurate metabolic association network enables investigation of the functional behavior of metabolites in a cell or tissue. Gaussian graphical model (GGM)-based methods have been widely used in genomics to infer biological networks. However, the performance of the various GGM-based methods in constructing metabolic association networks remains unknown in metabolomics. The performance of principal component regression (PCR), independent component regression (ICR), shrinkage covariance estimate (SCE), partial least squares regression (PLSR), and extrinsic similarity (ES) methods in constructing metabolic association networks was compared by estimating partial correlation coefficient matrices when the number of variables is larger than the sample size. To do this, the sample size and the network density (complexity) were treated as variables for network construction. Simulation studies show that, in terms of F1 scores, PCR and ICR are more robust to changes in the sample size and the network density than SCE and PLSR. These methods were further applied to the analysis of experimental metabolomics data acquired from metabolite extracts of mouse liver. For the simulated data, the proposed methods PCR and ICR outperform the other methods when the network density is large, while PLSR and SCE perform better when the network density is small. For the experimental metabolomics data, PCR and ICR discover more significant edges and perform better than PLSR and SCE when the discovered edges are evaluated against KEGG pathways. These results suggest that the metabolic network might be more complex and that PCR and ICR therefore have an advantage over PLSR and SCE in constructing metabolic association networks. (C) 2014 Elsevier B.V. All rights reserved.
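The abstract refers to estimating a partial correlation matrix when there are more variables than samples and to scoring recovered edges with F1. The following is a minimal illustrative sketch of those two ideas, not the authors' implementation. It assumes a simple shrinkage of the sample covariance toward its diagonal with a fixed weight `lam` (the paper's SCE method may choose this differently), and the function names, `lam`, and `threshold` are hypothetical choices made only for illustration.

```python
import numpy as np


def shrinkage_partial_corr(X, lam=0.2):
    """Partial correlations from a covariance shrunk toward its diagonal.

    X is an (n_samples, n_variables) array. lam is an assumed, fixed
    shrinkage weight in (0, 1]; it keeps the estimate invertible even
    when n_variables > n_samples (given non-zero variances).
    """
    S = np.cov(X, rowvar=False)                      # p x p sample covariance
    S_shrunk = (1.0 - lam) * S + lam * np.diag(np.diag(S))
    omega = np.linalg.inv(S_shrunk)                  # precision matrix
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)                   # rho_ij = -w_ij / sqrt(w_ii * w_jj)
    np.fill_diagonal(pcor, 1.0)
    return pcor


def edge_f1(pcor, true_adj, threshold=0.1):
    """F1 score of edges called by |partial correlation| > threshold."""
    iu = np.triu_indices_from(pcor, k=1)             # each undirected edge once
    pred = np.abs(pcor[iu]) > threshold
    truth = np.asarray(true_adj)[iu].astype(bool)
    tp = np.sum(pred & truth)
    if tp == 0:
        return 0.0
    precision = tp / pred.sum()
    recall = tp / truth.sum()
    return float(2 * precision * recall / (precision + recall))
```

In a simulation of the kind the abstract describes, one would generate data from a known network, call `shrinkage_partial_corr` on it, and compare the thresholded result with the true adjacency matrix via `edge_f1` across different sample sizes and network densities.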


Pages: 193-202

Page count: 10
