An Integrated Pipeline for Annotation and Visualization of Metagenomic Contigs

Cited by: 88
Authors
Dong, Xiaoli [1]
Strous, Marc [1]
Affiliations
[1] Univ Calgary, Dept Geosci, Calgary, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada; Canada Foundation for Innovation
Keywords
metagenomics; metaproteomics; bioinformatics; gene prediction; functional annotation; taxonomic classification; pathway prediction; visualization; GENOME; TOPOLOGY; RESOURCE; SEARCH; GENES;
DOI
10.3389/fgene.2019.00999
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Here, we describe MetaErg, a standalone and fully automated metagenome and metaproteome annotation pipeline. Annotation of metagenomes is challenging. First, metagenomes contain sequence data of many organisms from all domains of life. Second, many of these are from understudied lineages, encoding genes with low similarity to experimentally validated reference genes. Third, assembly and binning are not perfect, sometimes resulting in artifactual hybrid contigs or genomes. To address these challenges, MetaErg provides graphical summaries of annotation outcomes, both for the complete metagenome and for individual metagenome-assembled genomes (MAGs). It performs a comprehensive annotation of each gene, including taxonomic classification, enabling functional inferences despite low similarity to reference genes, as well as detection of potential assembly or binning artifacts. When provided with metaproteome information, it visualizes gene and pathway activity using sequencing coverage and proteomic spectral counts, respectively. For visualization, MetaErg provides an HTML interface, bringing all annotation results together, and producing sortable and searchable tables, collapsible trees, and other graphic representations enabling intuitive navigation of complex data. MetaErg, implemented in Perl, HTML, and JavaScript, is a fully open source application, distributed under Academic Free License at https://github.com/xiaoli-dong/metaerg. MetaErg is also available as a docker image at https://hub.docker.com/r/xiaolidong/docker-metaerg.
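The abstract notes that per-gene taxonomic classification can help detect assembly or binning artifacts. The sketch below illustrates that general idea only and is not MetaErg's implementation: it flags contigs whose classified genes do not agree on a majority taxon. The function name `flag_mixed_contigs`, the input format (contig ID mapped to a list of per-gene taxon labels), and the `min_genes` and `min_majority` thresholds are all assumptions made for illustration.

```python
from collections import Counter


def flag_mixed_contigs(gene_taxa, min_genes=3, min_majority=0.7):
    """Return contigs whose genes disagree on taxonomy.

    gene_taxa: dict mapping a contig ID to a list of per-gene taxon
    labels (hypothetical format; None marks an unclassified gene).
    min_genes, min_majority: illustrative thresholds, not values
    taken from the paper.
    """
    flagged = {}
    for contig, taxa in gene_taxa.items():
        labels = [t for t in taxa if t is not None]
        if len(labels) < min_genes:
            continue  # too few classified genes to judge
        top_taxon, top_count = Counter(labels).most_common(1)[0]
        share = top_count / len(labels)
        if share < min_majority:
            # No taxon dominates: possible hybrid contig.
            flagged[contig] = (top_taxon, round(share, 2))
    return flagged


# Toy data, invented for this example.
example = {
    "contig_1": ["Proteobacteria"] * 5,
    "contig_2": ["Proteobacteria"] * 3 + ["Firmicutes"] * 2 + ["Euryarchaeota"],
    "contig_3": ["Firmicutes", None],
}
print(flag_mixed_contigs(example))
```

In this toy input only `contig_2` is reported, because its most common taxon covers half of its classified genes. A flagged contig is a candidate for manual inspection, not proof of an artifact, since horizontal gene transfer or sparse reference data can also produce mixed labels.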
Pages: 10