Systematic Evaluation of Normalization Methods for Glycomics Data Based on Performance of Network Inference

Cited by: 14
Authors
Benedetti, Elisa [1 ,2 ]
Gerstner, Nathalie [2 ,3 ]
Pucic-Bakovic, Maja [4 ]
Keser, Toma [5 ]
Reiding, Karli R. [6 ,7 ,8 ]
Ruhaak, L. Renee [8 ,9 ]
Stambuk, Tamara [5 ]
Selman, Maurice H. J. [6 ,7 ]
Rudan, Igor [10 ]
Polasek, Ozren [11 ,12 ]
Hayward, Caroline [13 ]
Beekman, Marian [14 ]
Slagboom, Eline [14 ]
Wuhrer, Manfred [8 ]
Dunlop, Malcolm G. [15 ,16 ]
Lauc, Gordan [4 ,5 ]
Krumsiek, Jan [1 ,2 ]
Affiliations
[1] Weill Cornell Med, Englander Inst Precis Med, Dept Physiol & Biophys, Inst Computat Biomed, New York, NY 10022 USA
[2] German Res Ctr Environm Hlth, Inst Computat Biol, Helmholtz Zentrum Munchen, D-85764 Neuherberg, Germany
[3] Max Planck Inst Psychiat, D-80804 Munich, Germany
[4] Genos Glycosci Res Lab, Zagreb 10000, Croatia
[5] Univ Zagreb, Fac Pharm & Biochem, Zagreb 10000, Croatia
[6] Univ Utrecht, Bijvoet Ctr Biomol Res, Biomol Mass Spectrometry & Prote, NL-3584 CH Utrecht, Netherlands
[7] Univ Utrecht, Utrecht Inst Pharmaceut Sci, NL-3584 CH Utrecht, Netherlands
[8] Leiden Univ, Ctr Prote & Metabol, Med Ctr, NL-2333 ZC Leiden, Netherlands
[9] Leiden Univ, Dept Clin Chem & Lab Med, Med Ctr, NL-2333 ZC Leiden, Netherlands
[10] Univ Edinburgh, Usher Inst Populat Hlth Sci & Informat, Edinburgh EH8 9AG, Midlothian, Scotland
[11] Univ Split, Med Sch, Split 21000, Croatia
[12] Gen Info Ltd, Zagreb 10000, Croatia
[13] Univ Edinburgh, Inst Genet & Mol Med, Human Genet Unit, MRC, Edinburgh EH4 2XU, Midlothian, Scotland
[14] Leiden Univ, Sect Mol Epidemiol, Med Ctr, NL-2333 ZC Leiden, Netherlands
[15] Univ Edinburgh, Inst Genet & Mol Med, Colon Canc Genet Grp, Edinburgh EH8 9YL, Midlothian, Scotland
[16] MRC, Human Genet Unit, Edinburgh EH8 9YL, Midlothian, Scotland
Funding
UK Medical Research Council;
Keywords
glycomics; data normalization; Gaussian graphical models; COMPOSITIONAL DATA-ANALYSIS; GLYCOSYLATION; ALLOTYPES; DISCOVERY; MODEL;
DOI
10.3390/metabo10070271
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Glycomics measurements, like all other high-throughput technologies, are subject to technical variation due to fluctuations in the experimental conditions. The removal of this non-biological signal from the data is referred to as normalization. In contrast to other omics data types, no systematic evaluation of normalization options for glycomics data has been published so far. In this paper, we assess the quality of different normalization strategies for glycomics data with an innovative approach. It has been shown previously that Gaussian Graphical Models (GGMs) inferred from glycomics data are able to identify enzymatic steps in the glycan synthesis pathways in a data-driven fashion. Based on this finding, we quantify the quality of a given normalization method according to how well a GGM inferred from the respective normalized data reconstructs known synthesis reactions in the glycosylation pathway. The method therefore exploits a biological measure of goodness. We analyzed 23 different normalization combinations applied to six large-scale glycomics cohorts across three experimental platforms: Liquid Chromatography-Electrospray Ionization-Mass Spectrometry (LC-ESI-MS), Ultra High Performance Liquid Chromatography with Fluorescence Detection (UHPLC-FLD), and Matrix Assisted Laser Desorption Ionization-Fourier Transform Ion Cyclotron Resonance-Mass Spectrometry (MALDI-FTICR-MS). Based on our results, we recommend normalizing glycan data using the 'Probabilistic Quotient' method followed by log-transformation, irrespective of the measurement platform. This recommendation is further supported by an additional analysis, in which we ranked normalization methods based on their statistical associations with age, a factor known to associate with glycomics measurements.
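The recommended pipeline (Probabilistic Quotient normalization followed by log-transformation) can be illustrated with a minimal Python sketch. It follows the commonly used formulation of the method, in which the reference profile is the feature-wise median across samples; the abstract does not specify the authors' exact implementation (choice of reference, handling of missing values), so those details are assumptions here. The function name `pqn_log` and the toy data are hypothetical.

```python
import numpy as np


def pqn_log(X):
    """Probabilistic Quotient normalization followed by a natural log transform.

    X is a (samples x glycans) array of strictly positive intensities.
    Assumption: the reference profile is the per-glycan median across samples.
    """
    X = np.asarray(X, dtype=float)
    if not np.all(np.isfinite(X)) or np.any(X <= 0):
        raise ValueError("X must contain finite, strictly positive values")

    # Reference profile: median intensity of each glycan across all samples.
    reference = np.median(X, axis=0)

    # Quotient of every measurement with respect to the reference profile.
    quotients = X / reference

    # Per-sample dilution factor: the median quotient of that sample.
    dilution = np.median(quotients, axis=1, keepdims=True)

    # Remove the estimated dilution, then log-transform.
    return np.log(X / dilution)


# Toy data (hypothetical): one glycan profile measured at four dilutions.
rng = np.random.default_rng(0)
profile = rng.uniform(1.0, 10.0, size=(1, 5))
dilutions = np.array([[0.5], [1.0], [2.0], [4.0]])
X = profile * dilutions

normalized = pqn_log(X)
```

Because the four toy samples differ only by a global dilution factor, all rows of `normalized` coincide after the correction. Real cohorts also carry biological variation, which the median quotient is intended to leave largely intact.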
Pages: 1 / 16
Page count: 17