cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs

Cited by: 19
Authors
Tolstoganov, Ivan [1]
Bankevich, Anton [2]
Chen, Zhoutao [3]
Pevzner, Pavel A. [1, 2]
Affiliations
[1] St Petersburg State Univ, Inst Translat Biomed, Ctr Algorithm Biotechnol, St Petersburg, Russia
[2] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
[3] Universal Sequencing Technol Corp, Carlsbad, CA USA
Funding
Russian Science Foundation
Keywords
DNA EXTRACTION; GENOME; ACCURATE;
DOI
10.1093/bioinformatics/btz349
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: The recently developed barcoding-based synthetic long read (SLR) technologies have already found many applications in genome assembly and analysis. However, although some new barcoding protocols are emerging and the range of SLR applications is being expanded, the existing SLR assemblers are optimized for a narrow range of parameters and are not easily extendable to new barcoding technologies and new applications such as metagenomics or hybrid assembly.

Results: We describe the algorithmic challenge of the SLR assembly and present a cloudSPAdes algorithm for SLR assembly that is based on analyzing the de Bruijn graph of SLRs. We benchmarked cloudSPAdes across various barcoding technologies/applications and demonstrated that it improves on the state-of-the-art SLR assemblers in accuracy and speed.

Availability and implementation: Source code and installation manual for cloudSPAdes are available at https://github.com/ablab/spades/releases/tag/cloudspades-paper.

Supplementary information: Supplementary data are available at Bioinformatics online.
Pages: i61-i70
Page count: 10