De novo assembly of human genomes with massively parallel short read sequencing

Cited by: 2257
Authors
Li, Ruiqiang [1, 2]
Zhu, Hongmei [1]
Ruan, Jue [1]
Qian, Wubin [1]
Fang, Xiaodong [1]
Shi, Zhongbin [1]
Li, Yingrui [1]
Li, Shengting [1]
Shan, Gao [1]
Kristiansen, Karsten [1, 2]
Li, Songgang [1]
Yang, Huanming [1]
Wang, Jian [1]
Wang, Jun [1, 2]
Affiliations
[1] Beijing Genom Inst Shenzhen, Shenzhen 518083, Peoples R China
[2] Univ Copenhagen, Dept Biol, DK-2200 Copenhagen, Denmark
Funding
National Natural Science Foundation of China;
Keywords
SHORT DNA-SEQUENCES; ALIGNMENT; MILLIONS; PROGRAM;
DOI
10.1101/gr.097261.109
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Next-generation massively parallel DNA sequencing technologies provide ultrahigh throughput at a substantially lower unit data cost; however, the reads they produce are very short, making de novo assembly extremely challenging. Here, we describe a novel method for de novo assembly of large genomes from short read sequences. We successfully assembled both the Asian and African human genome sequences, achieving N50 contig sizes of 7.4 and 5.9 kilobases (kb) and N50 scaffold sizes of 446.3 and 61.9 kb, respectively. The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.
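The abstract reports N50 values without defining the statistic. As a minimal sketch, assuming the common definition (the largest length L such that sequences of length at least L together cover at least half of the total assembled bases), N50 could be computed as below. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumed definition: sort lengths from longest to shortest and return
    the length at which the running sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in kb (not from the paper).
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))
```

In the toy input the total is 30 kb, so the two longest contigs (10 + 6 = 16 kb) already cover half of it, which is why N50 is driven by the longest sequences in an assembly.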
Pages: 265-272
Number of pages: 8
Related papers
50 in total
  • [1] Assemblathon 1: A competitive assessment of de novo short read assembly methods
    Earl, Dent
    Bradnam, Keith
    St John, John
    Darling, Aaron
    Lin, Dawei
    Fass, Joseph
    Hung On Ken Yu
    Buffalo, Vince
    Zerbino, Daniel R.
    Diekhans, Mark
    Ngan Nguyen
    Ariyaratne, Pramila Nuwantha
    Sung, Wing-Kin
    Ning, Zemin
    Haimel, Matthias
    Simpson, Jared T.
    Fonseca, Nuno A.
    Birol, Inanc
    Docking, T. Roderick
    Ho, Isaac Y.
    Rokhsar, Daniel S.
    Chikhi, Rayan
    Lavenier, Dominique
    Chapuis, Guillaume
    Naquin, Delphine
    Maillet, Nicolas
    Schatz, Michael C.
    Kelley, David R.
    Phillippy, Adam M.
    Koren, Sergey
    Yang, Shiaw-Pyng
    Wu, Wei
    Chou, Wen-Chi
    Srivastava, Anuj
    Shaw, Timothy I.
    Ruby, J. Graham
    Skewes-Cox, Peter
    Betegon, Miguel
    Dimon, Michelle T.
    Solovyev, Victor
    Seledtsov, Igor
    Kosarev, Petr
    Vorobyev, Denis
    Ramirez-Gonzalez, Ricardo
    Leggett, Richard
    MacLean, Dan
    Xia, Fangfang
    Luo, Ruibang
    Li, Zhenyu
    Xie, Yinlong
    GENOME RESEARCH, 2011, 21 (12) : 2224 - 2241
  • [2] De Novo Short Read Assembly Algorithm with Low Memory Usage
    Endo, Yuki
    Toyama, Fubito
    Chiba, Chikafumi
    Mori, Hiroshi
    Shoji, Kenji
    BIOINFORMATICS 2014: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON BIOINFORMATICS MODELS, METHODS AND ALGORITHMS, 2014, : 215 - 220
  • [3] Nanopore sequencing and the Shasta toolkit enable efficient de novo assembly of eleven human genomes
    Shafin, Kishwar
    Pesout, Trevor
    Lorig-Roach, Ryan
    Haukness, Marina
    Olsen, Hugh E.
    Bosworth, Colleen
    Armstrong, Joel
    Tigyi, Kristof
    Maurer, Nicholas
    Koren, Sergey
    Sedlazeck, Fritz J.
    Marschall, Tobias
    Mayes, Simon
    Costa, Vania
    Zook, Justin M.
    Liu, Kelvin J.
    Kilburn, Duncan
    Sorensen, Melanie
    Munson, Katy M.
    Vollger, Mitchell R.
    Monlong, Jean
    Garrison, Erik
    Eichler, Evan E.
    Salama, Sofie
    Haussler, David
    Green, Richard E.
    Akeson, Mark
    Phillippy, Adam
    Miga, Karen H.
    Carnevali, Paolo
    Jain, Miten
    Paten, Benedict
    NATURE BIOTECHNOLOGY, 2020, 38 (09) : 1044 - +
  • [4] Long-read sequencing and de novo assembly of a Chinese genome
    Shi, Lingling
    Guo, Yunfei
    Dong, Chengliang
    Huddleston, John
    Yang, Hui
    Han, Xiaolu
    Fu, Aisi
    Li, Quan
    Li, Na
    Gong, Siyi
    Lintner, Katherine E.
    Ding, Qiong
    Wang, Zou
    Hu, Jiang
    Wang, Depeng
    Wang, Feng
    Wang, Lin
    Lyon, Gholson J.
    Guan, Yongtao
    Shen, Yufeng
    Evgrafov, Oleg V.
    Knowles, James A.
    Thibaud-Nissen, Francoise
    Schneider, Valerie
    Yu, Chack-Yung
    Zhou, Libing
    Eichler, Evan E.
    So, Kwok-Fai
    Wang, Kai
    NATURE COMMUNICATIONS, 2016, 7
  • [5] Assembler for de novo assembly of large genomes
    Chu, Te-Chin
    Lu, Chen-Hua
    Liu, Tsunglin
    Lee, Greg C.
    Li, Wen-Hsiung
    Shih, Arthur Chun-Chieh
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2013, 110 (36) : E3417 - E3424
  • [6] Evaluating the Fidelity of De Novo Short Read Metagenomic Assembly Using Simulated Data
    Pignatelli, Miguel
    Moya, Andres
    PLOS ONE, 2011, 6 (05):
  • [7] HGA: de novo genome assembly method for bacterial genomes using high coverage short sequencing reads
    Al-okaily, Anas A.
    BMC GENOMICS, 2016, 17
  • [8] Long-read sequencing and de novo genome assembly of marine medaka (Oryzias melastigma)
    Liang, Pingping
    Saqib, Hafiz Sohaib Ahmed
    Ni, Xiaomin
    Shen, Yingjia
    BMC GENOMICS, 2020, 21 (01)
  • [9] Sequencing and de novo assembly of 150 genomes from Denmark as a population reference
    Maretty, Lasse
    Jensen, Jacob Malte
    Petersen, Bent
    Sibbesen, Jonas Andreas
    Liu, Siyang
    Villesen, Palle
    Skov, Laurits
    Belling, Kirstine
    Have, Christian Theil
    Izarzugaza, Jose M. G.
    Grosjean, Marie
    Bork-Jensen, Jette
    Grove, Jakob
    Als, Thomas D.
    Huang, Shujia
    Chang, Yuqi
    Xu, Ruiqi
    Ye, Weijian
    Rao, Junhua
    Guo, Xiaosen
    Sun, Jihua
    Cao, Hongzhi
    Ye, Chen
    van Beusekom, Johan
    Espeseth, Thomas
    Flindt, Esben
    Friborg, Rune M.
    Halager, Anders E.
    Le Hellard, Stephanie
    Hultman, Christina M.
    Lescai, Francesco
    Li, Shengting
    Lund, Ole
    Longren, Peter
    Mailund, Thomas
    Matey-Hernandez, Maria Luisa
    Mors, Ole
    Pedersen, Christian N. S.
    Sicheritz-Ponten, Thomas
    Sullivan, Patrick
    Syed, Ali
    Westergaard, David
    Yadav, Rachita
    Li, Ning
    Xu, Xun
    Hansen, Torben
    Krogh, Anders
    Bolund, Lars
    Sorensen, Thorkild I. A.
    Pedersen, Oluf
    NATURE, 2017, 548 (7665) : 87 - +
  • [10] Parallelized short read assembly of large genomes using de Bruijn graphs
    Liu, Yongchao
    Schmidt, Bertil
    Maskell, Douglas L.
    BMC BIOINFORMATICS, 2011, 12