The impact of sequence database choice on metaproteomic results in gut microbiota studies

Cited by: 92
Authors
Tanca, Alessandro [1]
Palomba, Antonio [1]
Fraumene, Cristina [1]
Pagnozzi, Daniela [1]
Manghina, Valeria [2]
Deligios, Massimo [2]
Muth, Thilo [3, 4]
Rapp, Erdmann [3]
Martens, Lennart [5, 6, 7]
Addis, Maria Filippa [1]
Uzzau, Sergio [1, 2]
Affiliations
[1] Porto Conte Ric, Sci & Technol Pk Sardinia, Tramariglio, Alghero, Italy
[2] Univ Sassari, Dept Biomed Sci, Sassari, Italy
[3] Max Planck Inst Dynam Complex Tech Syst, Magdeburg, Germany
[4] Robert Koch Inst, Res Grp Bioinformat NG 4, Berlin, Germany
[5] Univ Ghent, Dept Biochem, Ghent, Belgium
[6] VIB, Ctr Med Biotechnol, Ghent, Belgium
[7] Univ Ghent, Bioinformat Inst Ghent, Ghent, Belgium
Source
MICROBIOME | 2016 / Volume 4
Keywords
Bioinformatics; Gut microbiota; Mass spectrometry; Metagenomics; Metaproteomics; PEPTIDE IDENTIFICATION; METABOLIC FUNCTIONS; MASS-SPECTROMETRY; PROTEOMICS; SEARCH; COMMUNITIES; HOST; MICE; PHYSIOLOGY; PROTEINS
DOI
10.1186/s40168-016-0196-8
Chinese Library Classification number
Q93 [Microbiology]
Discipline classification code
071005; 100705
Abstract
Background: Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in life sciences. In this respect, metaproteomics, the study of the whole protein complement of a microbial community, can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially those related to the choice of appropriate sequence databases for protein identification. Results: Here, we present a systematic investigation of variables concerning database construction and annotation and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. In particular, the contribution of experimental metagenomic databases proved to be mandatory when dealing with mouse samples. Moreover, the use of a "merged" database, containing all metagenomic sequences from the population under study, was found to be generally preferable to the use of sample-matched databases. We also observed that taxonomic and functional results are strongly database-dependent, in particular when analyzing the mouse gut microbiota. As a striking example, the Firmicutes/Bacteroidetes ratio varied up to tenfold depending on the database used. Finally, assembling reads into longer contigs provided significant advantages in terms of functional annotation yields. Conclusions: This study helps to identify host- and database-specific biases that need to be taken into account in a metaproteomic experiment, providing meaningful insights into how to design gut microbiota studies and perform metaproteomic data analysis. In particular, the use of multiple databases and annotation tools should be encouraged, even though this requires appropriate bioinformatic resources.
Pages: 13
Related papers
54 entries in total
  • [1] Survey of the camel urinary proteome by shotgun proteomics using a multiple database search strategy
    Alhaider, Abdulqader A.
    Bayoumy, Nervana
    Argo, Evelyn
    Gader, Abdel G. M. A.
    Stead, David A.
    [J]. PROTEOMICS, 2012, 12 (22) : 3403 - 3406
  • [2] [Anonymous], 2015, NUCLEIC ACIDS RES
  • [3] Strain-resolved microbial community proteomics reveals simultaneous aerobic and anaerobic function during gastrointestinal tract colonization of a preterm infant
    Brooks, Brandon
    Mueller, Ryan S.
    Young, Jacque C.
    Morowitz, Michael J.
    Hettich, Robert L.
    Banfield, Jillian F.
    [J]. FRONTIERS IN MICROBIOLOGY, 2015, 6
  • [4] Analysis of Biostimulated Microbial Communities from Two Field Experiments Reveals Temporal and Spatial Differences in Proteome Profiles
    Callister, Stephen J.
    Wilkins, Michael J.
    Nicora, Carrie D.
    Williams, Kenneth H.
    Banfield, Jillian F.
    VerBerkmoes, Nathan C.
    Hettich, Robert L.
    N'Guessan, Lucie
    Mouser, Paula J.
    Elifantz, Hila
    Smith, Richard D.
    Lovley, Derek R.
    Lipton, Mary S.
    Long, Philip E.
    [J]. ENVIRONMENTAL SCIENCE & TECHNOLOGY, 2010, 44 (23) : 8897 - 8903
  • [5] Strategies for Metagenomic-Guided Whole-Community Proteomics of Complex Microbial Environments
    Cantarel, Brandi L.
    Erickson, Alison R.
    VerBerkmoes, Nathan C.
    Erickson, Brian K.
    Carey, Patricia A.
    Pan, Chongle
    Shah, Manesh
    Mongodin, Emmanuel F.
    Jansson, Janet K.
    Fraser-Liggett, Claire M.
    Hettich, Robert L.
    [J]. PLOS ONE, 2011, 6 (11):
  • [6] QIIME allows analysis of high-throughput community sequencing data
    Caporaso, J. Gregory
    Kuczynski, Justin
    Stombaugh, Jesse
    Bittinger, Kyle
    Bushman, Frederic D.
    Costello, Elizabeth K.
    Fierer, Noah
    Pena, Antonio Gonzalez
    Goodrich, Julia K.
    Gordon, Jeffrey I.
    Huttley, Gavin A.
    Kelley, Scott T.
    Knights, Dan
    Koenig, Jeremy E.
    Ley, Ruth E.
    Lozupone, Catherine A.
    McDonald, Daniel
    Muegge, Brian D.
    Pirrung, Meg
    Reeder, Jens
    Sevinsky, Joel R.
    Turnbaugh, Peter J.
    Walters, William A.
    Widmann, Jeremy
    Yatsunenko, Tanya
    Zaneveld, Jesse
    Knight, Rob
    [J]. NATURE METHODS, 2010, 7 (05) : 335 - 336
  • [7] MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification
    Cox, Juergen
    Mann, Matthias
    [J]. NATURE BIOTECHNOLOGY, 2008, 26 (12) : 1367 - 1372
  • [8] TANDEM: matching proteins with tandem mass spectra
    Craig, R
    Beavis, RC
    [J]. BIOINFORMATICS, 2004, 20 (09) : 1466 - 1467
  • [9] High-fat diet alters gut microbiota physiology in mice
    Daniel, Hannelore
    Gholami, Amin Moghaddas
    Berry, David
    Desmarchelier, Charles
    Hahne, Hannes
    Loh, Gunnar
    Mondot, Stanislas
    Lepage, Patricia
    Rothballer, Michael
    Walker, Alesia
    Boehm, Christoph
    Wenning, Mareike
    Wagner, Michael
    Blaut, Michael
    Schmitt-Kopplin, Philippe
    Kuster, Bernhard
    Haller, Dirk
    Clavel, Thomas
    [J]. ISME JOURNAL, 2014, 8 (02) : 295 - 308
  • [10] Error filtering, pair assembly and error correction for next-generation sequencing reads
    Edgar, Robert C.
    Flyvbjerg, Henrik
    [J]. BIOINFORMATICS, 2015, 31 (21) : 3476 - 3482