A hybrid approach for the automated finishing of bacterial genomes

Cited by: 134
Authors
Bashir, Ali [1, 2]
Klammer, Aaron A. [1]
Robins, William P. [3]
Chin, Chen-Shan [1]
Webster, Dale [1]
Paxinos, Ellen [1]
Hsu, David [1]
Ashby, Meredith [1]
Wang, Susana [1]
Peluso, Paul [1]
Sebra, Robert [1]
Sorenson, Jon [1]
Bullard, James [1]
Yen, Jackie [1]
Valdovino, Marie [1]
Mollova, Emilia [1]
Luong, Khai [1]
Lin, Steven [1]
Lamay, Brianna [1]
Joshi, Amruta [1]
Rowe, Lori [4]
Frace, Michael [4]
Tarr, Cheryl L. [4]
Turnsek, Maryann [4]
Davis, Brigid M. [5, 6]
Kasarskis, Andrew [1]
Mekalanos, John J. [3]
Waldor, Matthew K. [3, 5, 6]
Schadt, Eric E. [1, 2]
Affiliations
[1] Pacific Biosci, Menlo Pk, CA USA
[2] Mt Sinai Sch Med, Dept Genet & Genom Sci, New York, NY USA
[3] Harvard Univ, Sch Med, Dept Microbiol & Mol Genet, Boston, MA 02115 USA
[4] Ctr Dis Control & Prevent, Natl Ctr Emerging & Zoonot Infect Dis, Atlanta, GA USA
[5] Harvard Univ, Sch Med, Dept Med, Boston, MA USA
[6] Howard Hughes Med Inst, Boston, MA 02115 USA
Funding
US National Institutes of Health (NIH);
Keywords
READ SEQUENCE DATA; VIBRIO-CHOLERAE; STRUCTURAL VARIATION; ORIGIN; GENERATION; INTEGRONS; ASSEMBLER; OUTBREAK; STRAIN; HAITI;
DOI
10.1038/nbt.2288
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Advances in DNA sequencing technology have improved our ability to characterize most genomic diversity. However, accurate resolution of large structural events is challenging because of the short read lengths of second-generation technologies. Third-generation sequencing technologies, which can yield longer multikilobase reads, have the potential to address limitations associated with genome assembly. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically relevant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 cholera reference strain, we obtained 14 scaffolds of greater than 1 kb for the experimental data and 8 scaffolds of greater than 1 kb for the simulated data, which allowed us to correct several errors in contigs assembled from the short-read data alone. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly.
Pages: 701 / +
Number of pages: 9
Related papers
49 in total
  • [1] Recent Clonal Origin of Cholera in Haiti
    Ali, Afsar
    Chen, Yuansha
    Johnson, Judith A.
    Redden, Edsel
    Mayette, Yfto
    Rashid, Mohammed H.
    Stine, O. Colin
    Morris, J. Glenn, Jr.
    [J]. EMERGING INFECTIOUS DISEASES, 2011, 17 (04) : 699 - 701
  • [2] APPLICATIONS OF NEXT-GENERATION SEQUENCING Genome structural variation discovery and genotyping
    Alkan, Can
    Coe, Bradley P.
    Eichler, Evan E.
    [J]. NATURE REVIEWS GENETICS, 2011, 12 (05) : 363 - 375
  • [3] Limitations of next-generation genome sequence assembly
    Alkan, Can
    Sajjadian, Saba
    Eichler, Evan E.
    [J]. NATURE METHODS, 2011, 8 (01) : 61 - 65
  • [4] Batzoglou S, 2002, GENOME RES, V12, P177, DOI 10.1101/gr.208902
  • [5] ALLPATHS: De novo assembly of whole-genome shotgun microreads
    Butler, Jonathan
    MacCallum, Iain
    Kleber, Michael
    Shlyakhter, Ilya A.
    Belmonte, Matthew K.
    Lander, Eric S.
    Nusbaum, Chad
    Jaffe, David B.
    [J]. GENOME RESEARCH, 2008, 18 (05) : 810 - 820
  • [6] Genome Project Standards in a New Era of Sequencing
    Chain, P. S. G.
    Grafham, D. V.
    Fulton, R. S.
    FitzGerald, M. G.
    Hostetler, J.
    Muzny, D.
    Ali, J.
    Birren, B.
    Bruce, D. C.
    Buhay, C.
    Cole, J. R.
    Ding, Y.
    Dugan, S.
    Field, D.
    Garrity, G. M.
    Gibbs, R.
    Graves, T.
    Han, C. S.
    Harrison, S. H.
    Highlander, S.
    Hugenholtz, P.
    Khouri, H. M.
    Kodira, C. D.
    Kolker, E.
    Kyrpides, N. C.
    Lang, D.
    Lapidus, A.
    Malfatti, S. A.
    Markowitz, V.
    Metha, T.
    Nelson, K. E.
    Parkhill, J.
    Pitluck, S.
    Qin, X.
    Read, T. D.
    Schmutz, J.
    Sozhamannan, S.
    Sterk, P.
    Strausberg, R. L.
    Sutton, G.
    Thomson, N. R.
    Tiedje, J. M.
    Weinstock, G.
    Wollam, A.
    Detter, J. C.
    [J]. SCIENCE, 2009, 326 (5950) : 236 - 237
  • [7] Fragment assembly with short reads
    Chaisson, M
    Pevzner, P
    Tang, HX
    [J]. BIOINFORMATICS, 2004, 20 (13) : 2067 - 2074
  • [8] Short read fragment assembly of bacterial genomes
    Chaisson, Mark J.
    Pevzner, Pavel A.
    [J]. GENOME RESEARCH, 2008, 18 (02) : 324 - 330
  • [9] The Origin of the Haitian Cholera Outbreak Strain
    Chin, Chen-Shan
    Sorenson, Jon
    Harris, Jason B.
    Robins, William P.
    Charles, Richelle C.
    Jean-Charles, Roger R.
    Bullard, James
    Webster, Dale R.
    Kasarskis, Andrew
    Peluso, Paul
    Paxinos, Ellen E.
    Yamaichi, Yoshiharu
    Calderwood, Stephen B.
    Mekalanos, John J.
    Schadt, Eric E.
    Waldor, Matthew K.
    [J]. NEW ENGLAND JOURNAL OF MEDICINE, 2011, 364 (01) : 33 - 42
  • [10] CTXφ contains a hybrid genome derived from tandemly integrated elements
    Davis, BM
    Waldor, MK
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (15) : 8572 - 8577