Comparative studies of de novo assembly tools for next-generation sequencing technologies

Cited by: 81
Authors
Lin, Yong [1, 2]
Li, Jian [3]
Shen, Hui [3]
Zhang, Lei [1, 2]
Papasian, Christopher J. [2]
Deng, Hong-Wen [1, 2, 3]
Affiliations
[1] Shanghai Univ Sci & Technol, Ctr Syst Biomed Sci, Shanghai 200093, Peoples R China
[2] Univ Missouri, Sch Med, Kansas City, MO 64108 USA
[3] Tulane Univ, Sch Publ Hlth & Trop Med, Dept Biostat & Bioinformat, New Orleans, LA 70112 USA
Keywords
SHORT DNA-SEQUENCES; MILLIONS; GENOMES; READS; ERROR
DOI
10.1093/bioinformatics/btr319
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Motivation: Several new de novo assembly tools have been developed recently to assemble short sequencing reads generated by next-generation sequencing platforms. However, the performance of these tools under various conditions has not been fully investigated, and there is currently insufficient information to make an informed choice of the tool most likely to perform best under a specific set of conditions. Results: We studied and compared the performance of commonly used de novo assembly tools specifically designed for next-generation sequencing data, including SSAKE, VCAKE, Euler-sr, Edena, Velvet, ABySS and SOAPdenovo. Tools were compared using several performance criteria, including N50 length, sequence coverage and assembly accuracy. Various properties of read data, including single-end/paired-end, sequence GC content, depth of coverage and base calling error rates, were investigated for their effects on the performance of different assembly tools. We also compared the computation time and memory usage of these seven tools. Based on the results of our comparison, the relative performance of individual tools is summarized, and tentative guidelines for selecting the optimal assembly tool under different conditions are provided.
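The abstract names N50 length, GC content and depth of coverage without defining them. The sketch below illustrates the commonly used definitions of these three quantities; it is not taken from the paper, and the function names (`n50`, `gc_content`, `depth_of_coverage`) are hypothetical helpers introduced only for illustration. The authors' exact definitions may differ in detail.

```python
def n50(contig_lengths):
    """Common N50 definition (an assumption, not quoted from the paper):
    the length L such that contigs of length >= L together account for
    at least half of the total assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


def gc_content(sequence):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def depth_of_coverage(num_reads, read_length, genome_length):
    """Average depth: total sequenced bases divided by genome length."""
    return num_reads * read_length / genome_length
```

For example, contigs of lengths 80, 70, 50, 40, 30 and 20 total 290 bases; the two longest already sum to 150, which exceeds half of the total, so the N50 is 70.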
Pages: 2031-2037
Number of pages: 7