cloudSPAdes: assembly of synthetic long reads using de Bruijn graphs

Cited by: 19
Authors
Tolstoganov, Ivan [1]
Bankevich, Anton [2]
Chen, Zhoutao [3]
Pevzner, Pavel A. [1,2]
Affiliations
[1] St Petersburg State Univ, Inst Translat Biomed, Ctr Algorithm Biotechnol, St Petersburg, Russia
[2] Univ Calif San Diego, Dept Comp Sci & Engn, La Jolla, CA 92093 USA
[3] Universal Sequencing Technol Corp, Carlsbad, CA USA
Funding
Russian Science Foundation
Keywords
DNA EXTRACTION; GENOME; ACCURATE
DOI
10.1093/bioinformatics/btz349
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: The recently developed barcoding-based synthetic long read (SLR) technologies have already found many applications in genome assembly and analysis. However, although some new barcoding protocols are emerging and the range of SLR applications is being expanded, the existing SLR assemblers are optimized for a narrow range of parameters and are not easily extendable to new barcoding technologies and new applications such as metagenomics or hybrid assembly.
Results: We describe the algorithmic challenge of the SLR assembly and present a cloudSPAdes algorithm for SLR assembly that is based on analyzing the de Bruijn graph of SLRs. We benchmarked cloudSPAdes across various barcoding technologies/applications and demonstrated that it improves on the state-of-the-art SLR assemblers in accuracy and speed.
Availability and implementation: Source code and installation manual for cloudSPAdes are available at https://github.com/ablab/spades/releases/tag/cloudspades-paper.
Supplementary information: Supplementary data are available at Bioinformatics online.
Pages: I61-I70
Page count: 10