HAVoC, a bioinformatic pipeline for reference-based consensus assembly and lineage assignment for SARS-CoV-2 sequences

Cited by: 31
Authors
Phuoc Thien Truong Nguyen [1]
Plyusnin, Ilya [2,3]
Sironen, Tarja [1,3]
Vapalahti, Olli [1,3,4,5]
Kant, Ravi [1,3]
Smura, Teemu [1,4,5]
Affiliations
[1] Univ Helsinki, Fac Med, Dept Virol, Helsinki, Finland
[2] Univ Helsinki, Inst Biotechnol, Helsinki, Finland
[3] Univ Helsinki, Dept Vet Biosci, Helsinki, Finland
[4] Univ Helsinki, Dept Virol, Helsinki, Finland
[5] Helsinki Univ Hosp, Helsinki, Finland
Funding
Academy of Finland
Keywords
SARS-CoV2; Variant detection; Reference assembly; Lineage identification; Coronavirus; Sequence analysis; DISEASE; ALIGNMENT;
DOI
10.1186/s12859-021-04294-2
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: SARS-CoV-2 related research has increased in importance worldwide since December 2019. Several new variants of SARS-CoV-2 have emerged globally, of which the most notable and concerning currently are the UK variant B.1.1.7, the South African variant B.1.351 and the Brazilian variant P.1. Detecting and monitoring novel variants is essential in SARS-CoV-2 surveillance. While there are several tools for assembling virus genomes and performing lineage analyses to investigate SARS-CoV-2, each performs only one or a few of these functions. Results: Because no publicly available pipeline could both perform fast reference-based assembly of raw SARS-CoV-2 sequence reads and identify lineages to detect variants of concern, we have developed an open source bioinformatic pipeline called HAVoC (Helsinki university Analyzer for Variants of Concern). HAVoC performs reference-based assembly of raw sequence reads and assigns the corresponding lineages to the resulting SARS-CoV-2 sequences. Conclusions: HAVoC is a pipeline that combines several bioinformatic tools to perform the analyses needed to investigate genetic variance among SARS-CoV-2 samples. The pipeline is particularly useful for those who need a more accessible and fast tool to detect and monitor the spread of SARS-CoV-2 variants of concern during local outbreaks. HAVoC is currently being used in Finland for monitoring the spread of SARS-CoV-2 variants. HAVoC user manual and source code are available at and , respectively.
Pages: 8
Related papers
29 in total
  • [1] Bedford T, 2021, NEXTSTRAIN
  • [2] Trimmomatic: a flexible trimmer for Illumina sequence data
    Bolger, Anthony M.
    Lohse, Marc
    Usadel, Bjoern
    [J]. BIOINFORMATICS, 2014, 30 (15) : 2114 - 2120
  • [3] Evaluation of Alignment Algorithms for Discovery and Identification of Pathogens Using RNA-Seq
    Borozan, Ivan
    Watt, Stuart N.
    Ferretti, Vincent
    [J]. PLOS ONE, 2013, 8 (10):
  • [4] Centers for Disease Control and Prevention (CDC), 2020, OPEN FORUM INFECT DI, DOI 10.1093/OFID/OFAA535
  • [5] fastp: an ultra-fast all-in-one FASTQ preprocessor
    Chen, Shifu
    Zhou, Yanqing
    Chen, Yaru
    Gu, Jia
    [J]. BIOINFORMATICS, 2018, 34 (17) : i884 - i890
  • [6] Dixon MG, 2014, MMWR-MORBID MORTAL W, V63, P548
  • [7] Edwards E., 2021, NBC NEWS
  • [8] Data, disease and diplomacy: GISAID's innovative contribution to global health
    Elbe, Stefan
    Buckland-Merrett, Gemma
    [J]. GLOBAL CHALLENGES, 2017, 1 (01) : 33 - 46
  • [9] Faria N. R, VIROLOGICAL, V2021
  • [10] Nextstrain: real-time tracking of pathogen evolution
    Hadfield, James
    Megill, Colin
    Bell, Sidney M.
    Huddleston, John
    Potter, Barney
    Callender, Charlton
    Sagulenko, Pavel
    Bedford, Trevor
    Neher, Richard A.
    [J]. BIOINFORMATICS, 2018, 34 (23) : 4121 - 4123