Hecatomb: an integrated software platform for viral metagenomics

Cited by: 1
Authors
Roach, Michael J. [1, 2, 3]
Beecroft, Sarah J. [4 ]
Mihindukulasuriya, Kathie A. [5, 6]
Wang, Leran [5, 6]
Paredes, Anne [5 ]
Cardenas, Luis Alberto Chica [5, 6]
Henry-Cocks, Kara [1 ]
Lima, Lais Farias Oliveira [7 ]
Dinsdale, Elizabeth A. [1 ]
Edwards, Robert A. [1 ]
Handley, Scott A. [5, 6]
Affiliations
[1] Flinders Univ S Australia, Adelaide, SA, Australia
[2] Univ Adelaide, Adelaide Ctr Epigenet, Adelaide, SA 5005, Australia
[3] Univ Adelaide, South Australian Immunogen Canc Inst, Adelaide, SA 5005, Australia
[4] Harry Perkins Inst Med Res, Perth, WA 6009, Australia
[5] Washington Univ, Sch Med, Dept Pathol & Immunol, St Louis, MO 63110 USA
[6] Washington Univ, Sch Med, Edison Family Ctr Genome Sci & Syst Biol, St Louis, MO 63110 USA
[7] San Diego State Univ, Biol Dept, San Diego, CA 92182 USA
Source
GIGASCIENCE | 2024 / Vol. 13
Funding
National Institutes of Health (USA);
Keywords
virome; virus discovery; bioinformatic workflow; viral metagenomics; BACTERIAL MICROBIOME; VIRUS DISCOVERY; VIROME; IMMUNODEFICIENCY; COMMUNITY; ASSEMBLER; MEGAHIT; ORIGINS;
DOI
10.1093/gigascience/giae020
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07; 0710; 09;
Abstract
Background: Modern sequencing technologies offer extraordinary opportunities for virus discovery and virome analysis. Annotation of viral sequences from metagenomic data requires a complex series of steps to ensure accurate annotation of individual reads and assembled contigs. In addition, varying study designs will require project-specific statistical analyses.
Findings: Here we introduce Hecatomb, a bioinformatic platform coordinating commonly used tasks required for virome analysis. Hecatomb means "a great sacrifice." In this setting, Hecatomb is "sacrificing" false-positive viral annotations using extensive quality control and tiered-database searches. Hecatomb processes metagenomic data obtained from both short- and long-read sequencing technologies, providing annotations to individual sequences and assembled contigs. Results are provided in commonly used data formats useful for downstream analysis. Here we demonstrate the functionality of Hecatomb through the reanalysis of a primate enteric virome and a novel coral reef virome.
Conclusion: Hecatomb provides an integrated platform to manage many commonly used steps for virome characterization, including rigorous quality control, host removal, and both read- and contig-based analysis. Each step is managed using the Snakemake workflow manager with dependency management using Conda. Hecatomb outputs several tables properly formatted for immediate use within popular data analysis and visualization tools, enabling effective data interpretation for a variety of study designs. Hecatomb is hosted on GitHub (github.com/shandley/hecatomb) and is available for installation from Bioconda and PyPI.
Pages: 16