Comparison of normalization methods in clinical research applications of mass spectrometry-based proteomics

Cited by: 2
Authors
Dubois, Etienne [1]
Galindo, Antonio Nunez [1]
Dayon, Loiec [1]
Cominetti, Ornella [1]
Affiliations
[1] Nestle Res, Nestle Inst Food Safety & Analyt Sci, Lausanne, Switzerland
Source
2020 IEEE CONFERENCE ON COMPUTATIONAL INTELLIGENCE IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (CIBCB) | 2020
Keywords
Mass spectrometry; Normalization; Obesity; Proteins; Proteomics; Tandem Mass Tags; Quantification; BIOMARKER DISCOVERY; UNWANTED VARIATION; PLASMA PROTEOME; EXPRESSION DATA; PROTEINS; QUANTIFICATION; ROBUSTNESS; THROUGHPUT; GENDER; SCALE;
DOI
10.1109/cibcb48159.2020.9277702
Chinese Library Classification number
Q [Biological Sciences];
Subject classification codes
07; 0710; 09;
Abstract
Large-scale proteomic studies have to deal with unwanted variability, especially when samples originate from different centers and/or multiple analytical batches are needed. Such variability is typically added throughout all the steps of a clinical study, from biological sample collection and storage, through sample preparation and spectral data acquisition, to peptide/protein quantification. In order to remove such diverse variability, normalization of the protein data is performed. There are several published works comparing normalization methods in the -omics field, but reports focusing on proteomic data generated with mass spectrometry (MS) are much fewer. Additionally, most of these studies have only dealt with small datasets. As a case study, we focused on the normalization of a large quantitative MS-based proteomic dataset obtained with isobaric tandem-mass tagging (TMT) of plasma samples from an overweight and obese pan-European cohort. Different normalization methods were evaluated, namely, standardization, quantile sample, removal of unwanted variation (RUV), ComBat, mean and median centering, and single standard normalization; some of these methods are generic while others have been specifically created to deal with genomic or metabolomic data. We checked how relationships between proteins and clinical variables were impacted after normalizing the data with the different methods. We compared the normalized datasets using an array of diagnostic plots. Some methods were well adapted for this particular large-scale shotgun proteomic dataset of human plasma samples. In particular, quantile sample normalization, RUV, mean and median centering showed very good performance, while quantile protein normalization provided results inferior to those obtained with unnormalized data.
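As an illustration of two of the generic methods named in the abstract, the following is a minimal Python sketch of median centering and of quantile normalization across samples. It is not the authors' implementation. It assumes log-scale abundances, no missing values, and that "quantile sample" normalization means the conventional procedure of giving every sample the same distribution. The function names `median_center` and `quantile_normalize` and the toy data are hypothetical.

```python
from statistics import mean, median


def median_center(samples):
    """Subtract each sample's own median from its values.

    `samples` is a list of samples; each sample is a list of
    log-scale protein abundances in the same protein order.
    """
    centered = []
    for sample in samples:
        m = median(sample)
        centered.append([value - m for value in sample])
    return centered


def quantile_normalize(samples):
    """Give every sample the same distribution of values.

    The reference distribution is the mean, across samples, of the
    values at each rank. Ties are broken by position rather than
    averaged, which is a simplification.
    """
    n = len(samples[0])
    sorted_samples = [sorted(sample) for sample in samples]
    reference = [mean(s[rank] for s in sorted_samples) for rank in range(n)]

    normalized = []
    for sample in samples:
        # Indices of this sample's proteins, from lowest to highest value.
        order = sorted(range(n), key=lambda i: sample[i])
        out = [0.0] * n
        for rank, index in enumerate(order):
            out[index] = reference[rank]
        normalized.append(out)
    return normalized


# Toy data: two samples, three proteins each (hypothetical values).
toy = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
centered = median_center(toy)
quantiled = quantile_normalize(toy)
```

After median centering, each sample has a median of zero. After quantile normalization, each sample contains the same set of values and only the within-sample ranking of proteins is preserved.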
Pages: 68-77
Number of pages: 10