inGAP: an integrated next-generation genome analysis pipeline

Cited by: 47
Authors
Qi, Ji [1]
Zhao, Fangqing [1]
Buboltz, Anne [1]
Schuster, Stephan C. [1]
Affiliations
[1] Penn State Univ, Ctr Comparat Genom & Bioinformat, University Pk, PA 16802 USA
Keywords
HIGH-THROUGHPUT; ALIGNMENT; TOOL
DOI
10.1093/bioinformatics/btp615
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
We develop a novel mining pipeline, the Integrative Next-generation Genome Analysis Pipeline (inGAP), guided by a Bayesian principle, to detect single nucleotide polymorphisms (SNPs) and insertions/deletions (indels) by comparing high-throughput pyrosequencing reads with a reference genome of related organisms. inGAP can be applied to the mapping of both Roche/454 and Illumina reads, with no restriction on read length. Experiments on simulated and experimental data show that the pipeline achieves an overall accuracy of 97% in SNP detection and 94% in indel detection. All detected SNPs and indels can be further evaluated with the graphical editor included in the pipeline. inGAP also provides functions for comparing multiple genomes and for assisting bacterial genome assembly.
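The abstract says only that SNP detection is "guided by a Bayesian principle" and gives no formula. As a rough illustration of what such a rule can look like, the sketch below computes a posterior over the four bases at a single site of a haploid genome from the read bases covering it. It is not inGAP's actual model: the function names `base_posteriors` and `call_site`, the uniform per-base error rate, the prior SNP rate, and the 0.99 posterior threshold are all assumptions made for this example.

```python
from math import exp, log

BASES = "ACGT"


def base_posteriors(ref_base, read_bases, error_rate=0.01, prior_snp=0.001):
    """Posterior probability of each candidate base at one haploid site.

    Assumed model (hypothetical, not taken from the paper):
    - prior: the site equals the reference with probability 1 - prior_snp,
      otherwise one of the three other bases with equal probability;
    - likelihood: each read base is correct with probability 1 - error_rate
      and is any one specific wrong base with probability error_rate / 3.
    """
    log_post = {}
    for candidate in BASES:
        prior = 1.0 - prior_snp if candidate == ref_base else prior_snp / 3.0
        log_p = log(prior)
        for b in read_bases:
            if b == candidate:
                log_p += log(1.0 - error_rate)
            else:
                log_p += log(error_rate / 3.0)
        log_post[candidate] = log_p

    # Normalise in log space to avoid underflow at high coverage.
    m = max(log_post.values())
    total = sum(exp(v - m) for v in log_post.values())
    return {b: exp(v - m) / total for b, v in log_post.items()}


def call_site(ref_base, read_bases, min_posterior=0.99, **model):
    """Return (most probable base, its posterior, whether it is called a SNP)."""
    post = base_posteriors(ref_base, read_bases, **model)
    best = max(post, key=post.get)
    is_snp = best != ref_base and post[best] >= min_posterior
    return best, post[best], is_snp


if __name__ == "__main__":
    # Eight reads all showing G over a reference A: strong evidence for a SNP.
    print(call_site("A", "GGGGGGGG"))
    # One discordant read out of eight: more likely a sequencing error.
    print(call_site("A", "AAAAAAAG"))
```

In this toy model a single mismatching read cannot outweigh the prior in favour of the reference, while consistent mismatches across many reads can. A real pipeline would also have to account for per-base quality scores, mapping uncertainty, and indels, none of which are modelled here.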
Pages: 127-129
Number of pages: 3
Related papers
13 in total
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]   Quality scores and SNP detection in sequencing-by-synthesis systems [J].
Brockman, William ;
Alvarez, Pablo ;
Young, Sarah ;
Garber, Manuel ;
Giannoukos, Georgia ;
Lee, William L. ;
Russ, Carsten ;
Lander, Eric S. ;
Nusbaum, Chad ;
Jaffe, David B. .
GENOME RESEARCH, 2008, 18 (05) :763-770
[3]   MUSCLE: multiple sequence alignment with high accuracy and high throughput [J].
Edgar, RC .
NUCLEIC ACIDS RESEARCH, 2004, 32 (05) :1792-1797
[4]   High-throughput sequencing provides insights into genome variation and evolution in Salmonella Typhi [J].
Holt, Kathryn E. ;
Parkhill, Julian ;
Mazzoni, Camila J. ;
Roumagnac, Philippe ;
Weill, Francois-Xavier ;
Goodhead, Ian ;
Rance, Richard ;
Baker, Stephen ;
Maskell, Duncan J. ;
Wain, John ;
Dolecek, Christiane ;
Achtman, Mark ;
Dougan, Gordon .
NATURE GENETICS, 2008, 40 (08) :987-993
[5]   Kent WJ, 2002, GENOME RES, V12, P656, DOI 10.1101/gr.229202
[6]   Mapping short DNA sequencing reads and calling variants using mapping quality scores [J].
Li, Heng ;
Ruan, Jue ;
Durbin, Richard .
GENOME RESEARCH, 2008, 18 (11) :1851-1858
[7]   Fast and accurate short read alignment with Burrows-Wheeler transform [J].
Li, Heng ;
Durbin, Richard .
BIOINFORMATICS, 2009, 25 (14) :1754-1760
[8]   SOAP: short oligonucleotide alignment program [J].
Li, Ruiqiang ;
Li, Yingrui ;
Kristiansen, Karsten ;
Wang, Jun .
BIOINFORMATICS, 2008, 24 (05) :713-714
[9]   Next-generation DNA sequencing methods [J].
Mardis, Elaine R. .
ANNUAL REVIEW OF GENOMICS AND HUMAN GENETICS, 2008, 9 :387-402
[10]   The impact of next-generation sequencing technology on genetics [J].
Mardis, Elaine R. .
TRENDS IN GENETICS, 2008, 24 (03) :133-141