CISA: Contig Integrator for Sequence Assembly of Bacterial Genomes

Times cited: 188
Authors
Lin, Shin-Hung [1]
Liao, Yu-Chieh [1]
Affiliations
[1] Natl Hlth Res Inst, Inst Populat Hlth Sci, Div Biostat & Bioinformat, Zhunan, Taiwan
Keywords
ALGORITHMS; READS;
DOI
10.1371/journal.pone.0060843
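Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;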
Abstract
A plethora of algorithmic assemblers have been proposed for the de novo assembly of genomes; however, no individual assembler guarantees the optimal assembly for diverse species. The parameters of an assembler are often tuned in order to generate the best possible assembly, but few efforts have been made to take advantage of multiple assemblies to yield an assembly of high accuracy. In this study, we employ various state-of-the-art assemblers to generate different sets of contigs for bacterial genomes. A tool named CISA has been developed to integrate the assemblies into a hybrid set of contigs, resulting in assemblies of superior contiguity and accuracy compared with both the assemblies generated by the state-of-the-art assemblers and the hybrid assemblies merged by existing tools. The tool is implemented in Python and requires MUMmer and BLAST+ to be installed on the local machine. The source code of CISA and examples of its use are available at http://sb.nhri.org.tw/CISA/.
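Because the abstract states that CISA needs MUMmer and BLAST+ on the local machine, a quick pre-flight check can be useful before running it. The following is a minimal sketch and not part of CISA itself. It assumes that `nucmer` is a representative MUMmer executable and that `makeblastdb` and `blastn` are representative BLAST+ executables; the abstract does not say which programs CISA actually calls. The helper name `find_missing_tools` is hypothetical.

```python
import shutil

# Assumption: these executable names stand in for the two packages.
# nucmer ships with MUMmer; makeblastdb and blastn ship with BLAST+.
REQUIRED_TOOLS = {
    "MUMmer": ["nucmer"],
    "BLAST+": ["makeblastdb", "blastn"],
}


def find_missing_tools(required=REQUIRED_TOOLS, which=shutil.which):
    """Return {package: [executables not found on PATH]} for missing tools.

    `which` is injectable so the lookup can be replaced in tests.
    """
    missing = {}
    for package, executables in required.items():
        absent = [exe for exe in executables if which(exe) is None]
        if absent:
            missing[package] = absent
    return missing


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        for package, executables in missing.items():
            print(f"{package}: not found on PATH: {', '.join(executables)}")
    else:
        print("MUMmer and BLAST+ executables were found on PATH.")
```

An empty result only shows that the executables are discoverable on `PATH`; it does not verify their versions or that CISA's own configuration points at them.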
Pages: 7