Assembly of long error-prone reads using de Bruijn graphs

Cited by: 232
Authors
Lin, Yu [1]
Yuan, Jeffrey [1]
Kolmogorov, Mikhail [1]
Shen, Max W. [1]
Chaisson, Mark [2]
Pevzner, Pavel A. [1]
Affiliations
[1] Univ Calif San Diego, Dept Comp Sci & Engn, San Diego, CA 92092 USA
[2] Univ Washington, Dept Genome Sci, Seattle, WA 98105 USA
Keywords
de Bruijn graph; genome assembly; single-molecule sequencing; GENOMES; ALGORITHMS; BACTERIAL; SEQUENCE; CLASSIFICATION; CHROMOSOME; TOOL;
DOI
10.1073/pnas.1604560113
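Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]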
Subject classification code
07; 0710; 09
Abstract
The recent breakthroughs in assembling long error-prone reads were based on the overlap-layout-consensus (OLC) approach and did not utilize the strengths of the alternative de Bruijn graph approach to genome assembly. Moreover, these studies often assume that applications of the de Bruijn graph approach are limited to short and accurate reads and that the OLC approach is the only practical paradigm for assembling long error-prone reads. We show how to generalize de Bruijn graphs for assembling long error-prone reads and describe the ABruijn assembler, which combines the de Bruijn graph and the OLC approaches and results in accurate genome reconstructions.
Pages: E8396-E8405
Page count: 10