The impact of sequence database choice on metaproteomic results in gut microbiota studies

Cited by: 92
Authors
Tanca, Alessandro [1]
Palomba, Antonio [1]
Fraumene, Cristina [1]
Pagnozzi, Daniela [1]
Manghina, Valeria [2]
Deligios, Massimo [2]
Muth, Thilo [3, 4]
Rapp, Erdmann [3]
Martens, Lennart [5, 6, 7]
Addis, Maria Filippa [1]
Uzzau, Sergio [1, 2]
Affiliations
[1] Porto Conte Ric, Sci & Technol Pk Sardinia, Tramariglio, Alghero, Italy
[2] Univ Sassari, Dept Biomed Sci, Sassari, Italy
[3] Max Planck Inst Dynam Complex Tech Syst, Magdeburg, Germany
[4] Robert Koch Inst, Res Grp Bioinformat NG 4, Berlin, Germany
[5] Univ Ghent, Dept Biochem, Ghent, Belgium
[6] VIB, Ctr Med Biotechnol, Ghent, Belgium
[7] Univ Ghent, Bioinformat Inst Ghent, Ghent, Belgium
Source
MICROBIOME | 2016 / Vol. 4
Keywords
Bioinformatics; Gut microbiota; Mass spectrometry; Metagenomics; Metaproteomics; PEPTIDE IDENTIFICATION; METABOLIC FUNCTIONS; MASS-SPECTROMETRY; PROTEOMICS; SEARCH; COMMUNITIES; HOST; MICE; PHYSIOLOGY; PROTEINS;
DOI
10.1186/s40168-016-0196-8
Chinese Library Classification number
Q93 [Microbiology];
Discipline classification code
071005 ; 100705 ;
Abstract
Background: Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in the life sciences. In this respect, metaproteomics, the study of the whole protein complement of a microbial community, can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially those related to choosing appropriate sequence databases for protein identification.
Results: Here, we present a systematic investigation of variables concerning database construction and annotation and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. In particular, the contribution of experimental metagenomic databases proved essential when dealing with mouse samples. Moreover, the use of a "merged" database, containing all metagenomic sequences from the population under study, was found to be generally preferable to the use of sample-matched databases. We also observed that taxonomic and functional results are strongly database-dependent, in particular when analyzing the mouse gut microbiota. As a striking example, the Firmicutes/Bacteroidetes ratio varied up to tenfold depending on the database used. Finally, assembling reads into longer contigs provided significant advantages in terms of functional annotation yields.
Conclusions: This study helps identify host- and database-specific biases that need to be taken into account in a metaproteomic experiment, providing meaningful insights into how to design gut microbiota studies and perform metaproteomic data analysis. In particular, the use of multiple databases and annotation tools should be encouraged, even though this requires appropriate bioinformatic resources.
Pages: 13
Related papers
54 in total
  • [41] Effective Leveraging of Targeted Search Spaces for Improving Peptide Identification in Tandem Mass Spectrometry Based Proteomics
    Shanmugam, Avinash K.
    Nesvizhskii, Alexey I.
    [J]. JOURNAL OF PROTEOME RESEARCH, 2015, 14 (12) : 5169 - 5178
  • [42] Combining Results of Multiple Search Engines in Proteomics
    Shteynberg, David
    Nesvizhskii, Alexey I.
    Moritz, Robert L.
    Deutsch, Eric W.
    [J]. MOLECULAR & CELLULAR PROTEOMICS, 2013, 12 (09) : 2383 - 2393
  • [43] The gut microbiota - masters of host development and physiology
    Sommer, Felix
    Baeckhed, Fredrik
    [J]. NATURE REVIEWS MICROBIOLOGY, 2013, 11 (04) : 227 - 238
  • [44] Enrichment or depletion? The impact of stool pretreatment on metaproteomic characterization of the human gut microbiota
    Tanca, Alessandro
    Palomba, Antonio
    Pisanu, Salvatore
    Addis, Maria Filippa
    Uzzau, Sergio
    [J]. PROTEOMICS, 2015, 15 (20) : 3474 - 3485
  • [45] A straightforward and efficient analytical pipeline for metaproteome characterization
    Tanca, Alessandro
    Palomba, Antonio
    Pisanu, Salvatore
    Deligios, Massimo
    Fraumene, Cristina
    Manghina, Valeria
    Pagnozzi, Daniela
    Addis, Maria Filippa
    Uzzau, Sergio
    [J]. MICROBIOME, 2014, 2
  • [46] Evaluating the Impact of Different Sequence Databases on Metaproteome Analysis: Insights from a Lab-Assembled Microbial Mixture
    Tanca, Alessandro
    Palomba, Antonio
    Deligios, Massimo
    Cubeddu, Tiziana
    Fraumene, Cristina
    Biosa, Grazia
    Pagnozzi, Daniela
    Addis, Maria Filippa
    Uzzau, Sergio
    [J]. PLOS ONE, 2013, 8 (12):
  • [47] Comparison of detergent-based sample preparation workflows for LTQ-Orbitrap analysis of the Escherichia coli proteome
    Tanca, Alessandro
    Biosa, Grazia
    Pagnozzi, Daniela
    Addis, Maria Filippa
    Uzzau, Sergio
    [J]. PROTEOMICS, 2013, 13 (17) : 2597 - 2607
  • [48] Peptide identification quality control
    Vaudel, Marc
    Burkhart, Julia M.
    Sickmann, Albert
    Martens, Lennart
    Zahedi, Rene P.
    [J]. PROTEOMICS, 2011, 11 (10) : 2105 - 2114
  • [49] Shotgun metaproteomics of the human distal gut microbiota
    Verberkmoes, Nathan C.
    Russell, Alison L.
    Shah, Manesh
    Godzik, Adam
    Rosenquist, Magnus
    Halfvarson, Jonas
    Lefsrud, Mark G.
    Apajalahti, Juha
    Tysk, Curt
    Hettich, Robert L.
    Jansson, Janet K.
    [J]. ISME JOURNAL, 2009, 3 (02) : 179 - 189
  • [50] 2016 update of the PRIDE database and its related tools
    Vizcaino, Juan Antonio
    Csordas, Attila
    del-Toro, Noemi
    Dianes, Jose A.
    Griss, Johannes
    Lavidas, Ilias
    Mayer, Gerhard
    Perez-Riverol, Yasset
    Reisinger, Florian
    Ternent, Tobias
    Xu, Qing-Wei
    Wang, Rui
    Hermjakob, Henning
    [J]. NUCLEIC ACIDS RESEARCH, 2016, 44 (D1) : D447 - D456