Robust Accurate Identification and Biomass Estimates of Microorganisms via Tandem Mass Spectrometry

Cited by: 7
Authors
Alves, Gelio [1]
Yu, Yi-Kuo [1]
Affiliation
[1] Natl Lib Med, Natl Ctr Biotechnol Informat, NIH, Bethesda, MD 20894 USA
Funding
National Institutes of Health (USA);
Keywords
STATISTICAL SIGNIFICANCE; PROTEIN IDENTIFICATION; METAPROTEOMICS; PROTEOMICS; DISEASES; CLASSIFICATION; MICROORGANISMS; CHALLENGES; SIMILARITY; GENOMICS;
DOI
10.1021/jasms.9b00035
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Rapid and accurate identification of microorganisms and estimation of their biomasses are of extreme importance to public health. Mass spectrometry has become an important technique for these purposes. Previously we published a workflow named Microorganism Classification and Identification (MiCId v.12.26.2017) that was shown to perform no worse than other workflows. This manuscript presents MiCId v.12.13.2018, which, in comparison with the earlier version v.12.26.2017, allows for biomass estimates, provides more accurate microorganism identifications (better control of the number of false positives), and is robust against increases in database size. This significant advance is made possible by three new ingredients. First, we apply a modified expectation-maximization method to compute, for each taxon considered, a prior probability that can be used for biomass estimates. Second, we introduce a new concept called ownership, through which the participation ratio is computed and used as the number of taxa to be kept within a cluster of closely related taxa. Third, based on confidently identified peptides, we calculate for each taxon its degree of independence from the rest of the taxa considered, to determine whether or not to split this taxon off from the cluster. Using 270 data files, each containing a large number of MS/MS spectra, we show that, in comparison with v.12.26.2017, version v.12.13.2018 yields superior retrieval results. We also show that MiCId v.12.13.2018 can estimate species biomass reasonably well.
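The abstract names two ingredients that a small example may help make concrete: expectation-maximization (EM) estimation of per-taxon prior probabilities, and a participation ratio used as an effective count of taxa. The Python sketch below is illustrative only and is not MiCId's implementation. It uses the textbook EM update for mixture proportions, not the modified method the authors describe, and the textbook participation ratio (sum of weights squared, divided by the sum of squared weights) applied to generic weights, since the abstract does not define ownership. The function names and the `likelihoods` input layout are assumptions made for this example.

```python
from typing import List


def em_priors(likelihoods: List[List[float]],
              n_iter: int = 200,
              tol: float = 1e-10) -> List[float]:
    """Textbook EM for mixture proportions (NOT MiCId's modified EM).

    likelihoods[i][k] is a hypothetical likelihood of identified
    peptide i under taxon k (an assumed input layout).
    Returns one estimated prior probability per taxon.
    """
    n_taxa = len(likelihoods[0])
    priors = [1.0 / n_taxa] * n_taxa  # start from a uniform prior
    for _ in range(n_iter):
        totals = [0.0] * n_taxa
        for row in likelihoods:
            # E-step: responsibility of each taxon for this peptide
            weighted = [p * l for p, l in zip(priors, row)]
            norm = sum(weighted)
            if norm == 0.0:
                continue  # peptide explained by no taxon; skip it
            for k, w in enumerate(weighted):
                totals[k] += w / norm
        total = sum(totals)
        if total == 0.0:
            return priors  # nothing to update from
        # M-step: new prior is the average responsibility
        new_priors = [t / total for t in totals]
        change = max(abs(a - b) for a, b in zip(new_priors, priors))
        priors = new_priors
        if change < tol:
            break
    return priors


def participation_ratio(weights: List[float]) -> float:
    """Textbook participation ratio: (sum w)^2 / sum(w^2).

    Equals n for n equal weights and 1 when a single weight dominates,
    so it can be read as an effective number of contributors.
    """
    total = sum(weights)
    squares = sum(w * w for w in weights)
    return (total * total) / squares if squares > 0.0 else 0.0


if __name__ == "__main__":
    # Three peptides, two taxa: two peptides are unique to taxon 0,
    # one peptide is unique to taxon 1 (made-up toy values).
    toy = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(em_priors(toy))
    print(participation_ratio([1.0, 1.0, 1.0, 1.0]))
```

In this toy setup the estimated priors play the role of relative abundances, and rounding the participation ratio of a cluster's weights would give a count of taxa to keep. How MiCId derives those weights from ownership is described in the paper itself, not here.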
Pages: 85-102
Number of pages: 35