Rapid Classification and Identification of Multiple Microorganisms with Accurate Statistical Significance via High-Resolution Tandem Mass Spectrometry

Cited by: 24
Authors
Alves, Gelio [1]
Wang, Guanghui [2]
Ogurtsov, Aleksey Y. [1]
Drake, Steven K. [3]
Gucek, Marjan [2]
Sacks, David B. [4]
Yu, Yi-Kuo [1]
Affiliations
[1] NIH, Natl Ctr Biotechnol Informat, Natl Lib Med, Bethesda, MD 20894 USA
[2] NHLBI, Prote Core, NIH, Bethesda, MD 20892 USA
[3] NIH, Crit Care Med Dept, Ctr Clin, Bethesda, MD 20892 USA
[4] NIH, Dept Lab Med, Ctr Clin, Bethesda, MD 20892 USA
Funding
National Institutes of Health (NIH), USA
Keywords
Pathogen identification; Microorganism classification; Statistical significance; Mass spectrometry; Proteomics; LASER-DESORPTION IONIZATION; PROTEIN IDENTIFICATION; SHOTGUN PROTEOMICS; BLOOD CULTURES; BACTERIA; SEQUENCE; DATABASE; CONFIDENCE; DISEASES; METAPROTEOMICS
DOI
10.1007/s13361-018-1986-y
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Rapid and accurate identification and classification of microorganisms is of paramount importance to public health and safety. With the advance of mass spectrometry (MS) technology, the speed of identification can be greatly improved. However, the increasing number of sequenced microbes complicates correct microbial identification, even in a simple sample, because of the large number of candidates present. To properly untwine candidate microbes in samples containing one or more microbes, one needs to go beyond apparent morphology or simple "fingerprinting"; to correctly prioritize the candidate microbes, one needs accurate statistical significance in microbial identification. We meet these challenges by using peptide-centric representations of microbes to better separate them and by augmenting our earlier analysis method that yields accurate statistical significance. Here, we present an updated analysis workflow that uses tandem MS (MS/MS) spectra for microbial identification or classification. We have demonstrated, using 226 publicly available MS/MS data files (each containing from 2500 to nearly 100,000 MS/MS spectra) and 4000 additional MS/MS data files, that the updated workflow can correctly identify multiple microbes at the genus and often the species level for samples containing more than one microbe. We have also shown that the proposed workflow computes accurate statistical significances, i.e., E values for identified peptides and unified E values for identified microbes. Our updated analysis workflow, MiCId, a freely available software package for Microorganism Classification and Identification, can be downloaded at https://www.ncbi.nlm.nih.gov/CBBresearch/Yu/downloads.html.
Pages: 1721-1737
Page count: 17
Related papers
78 in total
  • [31] Metaproteomics: Harnessing the Power of High Performance Mass Spectrometry to Identify the Suite of Proteins That Control Metabolic Activities in Microbial Communities
    Hettich, Robert L.
    Pan, Chongle
    Chourey, Karuna
    Giannone, Richard J.
    [J]. ANALYTICAL CHEMISTRY, 2013, 85 (09) : 4203 - 4214
  • [32] Next-Generation Sequencing: A Review of Technologies and Tools for Wound Microbiome Research
    Hodkinson, Brendan P.
    Grice, Elizabeth A.
    [J]. ADVANCES IN WOUND CARE, 2015, 4 (01) : 50 - 58
  • [33] IDENTIFICATION OF ENTEROBACTERIACEAE BY API 20E SYSTEM
    HOLMES, B
    WILLCOX, WR
    LAPAGE, SP
    [J]. JOURNAL OF CLINICAL PATHOLOGY, 1978, 31 (01) : 22 - 30
  • [34] MEGAN analysis of metagenomic data
    Huson, Daniel H.
    Auch, Alexander F.
    Qi, Ji
    Schuster, Stephan C.
    [J]. GENOME RESEARCH, 2007, 17 (03) : 377 - 386
  • [35] A Protein Processing Filter Method for Bacterial Identification by Mass Spectrometry-Based Proteomics
    Jabbour, Rabih E.
    Deshpande, Samir V.
    Stanford, Michael F.
    Wick, Charles H.
    Zulich, Alan W.
    Snyder, A. Peter
    [J]. JOURNAL OF PROTEOME RESEARCH, 2011, 10 (02) : 907 - 912
  • [36] Double-Blind Characterization of Non-Genome-Sequenced Bacteria by Mass Spectrometry-Based Proteomics
    Jabbour, Rabih E.
    Deshpande, Samir V.
    Wade, Mary Margaret
    Stanford, Michael F.
    Wick, Charles H.
    Zulich, Alan W.
    Skowronski, Evan W.
    Snyder, A. Peter
    [J]. APPLIED AND ENVIRONMENTAL MICROBIOLOGY, 2010, 76 (11) : 3637 - 3644
  • [37] A two-step database search method improves sensitivity in peptide sequence matches for metaproteomics and proteogenomics studies
    Jagtap, Pratik
    Goslinga, Jill
    Kooren, Joel A.
    McGowan, Thomas
    Wroblewski, Matthew S.
    Seymour, Sean L.
    Griffin, Timothy J.
    [J]. PROTEOMICS, 2013, 13 (08) : 1352 - 1357
  • [38] Metaproteomic analysis using the Galaxy framework
    Jagtap, Pratik D.
    Blakely, Alan
    Murray, Kevin
    Stewart, Shaun
    Kooren, Joel
    Johnson, James E.
    Rhodus, Nelson L.
    Rudney, Joel
    Griffin, Timothy J.
    [J]. PROTEOMICS, 2015, 15 (20) : 3553 - 3565
  • [39] 16S rRNA gene sequencing for bacterial identification in the diagnostic laboratory: Pluses, perils, and pitfalls
    Janda, J. Michael
    Abbott, Sharon L.
    [J]. JOURNAL OF CLINICAL MICROBIOLOGY, 2007, 45 (09) : 2761 - 2764
  • [40] Genome sequence of Shigella flexneri 2a: insights into pathogenicity through comparison with genomes of Escherichia coli K12 and O157
    Jin, Q
    Yuan, ZH
    Xu, JG
    Wang, Y
    Shen, Y
    Lu, WC
    Wang, JH
    Liu, H
    Yang, J
    Yang, F
    Zhang, XB
    Zhang, JY
    Yang, GW
    Wu, HT
    Qu, D
    Dong, J
    Sun, LL
    Xue, Y
    Zhao, AL
    Gao, YS
    Zhu, JP
    Kan, B
    Ding, KY
    Chen, SX
    Cheng, HS
    Yao, ZJ
    He, BK
    Chen, RS
    Ma, DL
    Qiang, BQ
    Wen, YM
    Hou, YD
    Yu, J
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (20) : 4432 - 4441