De novo assembly of human genomes with massively parallel short read sequencing

被引:2285
作者
Li, Ruiqiang [1 ,2 ]
Zhu, Hongmei [1 ]
Ruan, Jue [1 ]
Qian, Wubin [1 ]
Fang, Xiaodong [1 ]
Shi, Zhongbin [1 ]
Li, Yingrui [1 ]
Li, Shengting [1 ]
Shan, Gao [1 ]
Kristiansen, Karsten [1 ,2 ]
Li, Songgang [1 ]
Yang, Huanming [1 ]
Wang, Jian [1 ]
Wang, Jun [1 ,2 ]
机构
[1] Beijing Genom Inst Shenzhen, Shenzhen 518083, Peoples R China
[2] Univ Copenhagen, Dept Biol, DK-2200 Copenhagen, Denmark
基金
中国国家自然科学基金;
关键词
SHORT DNA-SEQUENCES; ALIGNMENT; MILLIONS; PROGRAM;
D O I
10.1101/gr.097261.109
中图分类号
Q5 [生物化学]; Q7 [分子生物学];
学科分类号
071010 ; 081704 ;
摘要
Next-generation massively parallel DNA sequencing technologies provide ultrahigh throughput at a substantially lower unit data cost; however, the data are very short read length sequences, making de novo assembly extremely challenging. Here, we describe a novel method for de novo assembly of large genomes from short read sequences. We successfully assembled both the Asian and African human genome sequences, achieving an N50 contig size of 7.4 and 5.9 kilobases (kb) and scaffold of 446.3 and 61.9 kb, respectively. The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.
引用
收藏
页码:265 / 272
页数:8
相关论文
共 31 条
[1]  
Batzoglou S, 2002, GENOME RES, V12, P177, DOI 10.1101/gr.208902
[2]   Accurate whole human genome sequencing using reversible terminator chemistry [J].
Bentley, David R. ;
Balasubramanian, Shankar ;
Swerdlow, Harold P. ;
Smith, Geoffrey P. ;
Milton, John ;
Brown, Clive G. ;
Hall, Kevin P. ;
Evers, Dirk J. ;
Barnes, Colin L. ;
Bignell, Helen R. ;
Boutell, Jonathan M. ;
Bryant, Jason ;
Carter, Richard J. ;
Cheetham, R. Keira ;
Cox, Anthony J. ;
Ellis, Darren J. ;
Flatbush, Michael R. ;
Gormley, Niall A. ;
Humphray, Sean J. ;
Irving, Leslie J. ;
Karbelashvili, Mirian S. ;
Kirk, Scott M. ;
Li, Heng ;
Liu, Xiaohai ;
Maisinger, Klaus S. ;
Murray, Lisa J. ;
Obradovic, Bojan ;
Ost, Tobias ;
Parkinson, Michael L. ;
Pratt, Mark R. ;
Rasolonjatovo, Isabelle M. J. ;
Reed, Mark T. ;
Rigatti, Roberto ;
Rodighiero, Chiara ;
Ross, Mark T. ;
Sabot, Andrea ;
Sankar, Subramanian V. ;
Scally, Aylwyn ;
Schroth, Gary P. ;
Smith, Mark E. ;
Smith, Vincent P. ;
Spiridou, Anastassia ;
Torrance, Peta E. ;
Tzonev, Svilen S. ;
Vermaas, Eric H. ;
Walter, Klaudia ;
Wu, Xiaolin ;
Zhang, Lu ;
Alam, Mohammed D. ;
Anastasi, Carole .
NATURE, 2008, 456 (7218) :53-59
[3]   Whole-genome re-sequencing [J].
Bentley, David R. .
CURRENT OPINION IN GENETICS & DEVELOPMENT, 2006, 16 (06) :545-552
[4]   ALLPATHS: De novo assembly of whole-genome shotgun microreads [J].
Butler, Jonathan ;
MacCallum, Iain ;
Kleber, Michael ;
Shlyakhter, Ilya A. ;
Belmonte, Matthew K. ;
Lander, Eric S. ;
Nusbaum, Chad ;
Jaffe, David B. .
GENOME RESEARCH, 2008, 18 (05) :810-820
[5]   Short read fragment assembly of bacterial genomes [J].
Chaisson, Mark J. ;
Pevzner, Pavel A. .
GENOME RESEARCH, 2008, 18 (02) :324-330
[6]   SHARCGS, a fast and highly accurate short-read assembly algorithm for de novo genomic sequencing [J].
Dohm, Juliane C. ;
Lottaz, Claudio ;
Borodina, Tatiana ;
Himmelbauer, Heinz .
GENOME RESEARCH, 2007, 17 (11) :1697-1706
[7]   A second generation human haplotype map of over 3.1 million SNPs [J].
Frazer, Kelly A. ;
Ballinger, Dennis G. ;
Cox, David R. ;
Hinds, David A. ;
Stuve, Laura L. ;
Gibbs, Richard A. ;
Belmont, John W. ;
Boudreau, Andrew ;
Hardenbol, Paul ;
Leal, Suzanne M. ;
Pasternak, Shiran ;
Wheeler, David A. ;
Willis, Thomas D. ;
Yu, Fuli ;
Yang, Huanming ;
Zeng, Changqing ;
Gao, Yang ;
Hu, Haoran ;
Hu, Weitao ;
Li, Chaohua ;
Lin, Wei ;
Liu, Siqi ;
Pan, Hao ;
Tang, Xiaoli ;
Wang, Jian ;
Wang, Wei ;
Yu, Jun ;
Zhang, Bo ;
Zhang, Qingrun ;
Zhao, Hongbin ;
Zhao, Hui ;
Zhou, Jun ;
Gabriel, Stacey B. ;
Barry, Rachel ;
Blumenstiel, Brendan ;
Camargo, Amy ;
Defelice, Matthew ;
Faggart, Maura ;
Goyette, Mary ;
Gupta, Supriya ;
Moore, Jamie ;
Nguyen, Huy ;
Onofrio, Robert C. ;
Parkin, Melissa ;
Roy, Jessica ;
Stahl, Erich ;
Winchester, Ellen ;
Ziaugra, Liuda ;
Altshuler, David ;
Shen, Yan .
NATURE, 2007, 449 (7164) :851-U3
[8]   Single-molecule DNA sequencing of a viral genome [J].
Harris, Timothy D. ;
Buzby, Phillip R. ;
Babcock, Hazen ;
Beer, Eric ;
Bowers, Jayson ;
Braslavsky, Ido ;
Causey, Marie ;
Colonell, Jennifer ;
DiMeo, James ;
Efcavitch, J. William ;
Giladi, Eldar ;
Gill, Jaime ;
Healy, John ;
Jarosz, Mirna ;
Lapen, Dan ;
Moulton, Keith ;
Quake, Stephen R. ;
Steinmann, Kathleen ;
Thayer, Edward ;
Tyurina, Anastasia ;
Ward, Rebecca ;
Weiss, Howard ;
Xie, Zheng .
SCIENCE, 2008, 320 (5872) :106-109
[9]   The atlas genome assembly system [J].
Havlak, P ;
Chen, R ;
Durbin, KJ ;
Egan, A ;
Ren, YR ;
Song, XZ ;
Weinstock, GM ;
Gibbs, RA .
GENOME RESEARCH, 2004, 14 (04) :721-732
[10]   De novo bacterial genome sequencing: Millions of very short reads assembled on a desktop computer [J].
Hernandez, David ;
Francois, Patrice ;
Farinelli, Laurent ;
Osteras, Magne ;
Schrenzel, Jacques .
GENOME RESEARCH, 2008, 18 (05) :802-809