NGSPE: A pipeline for end-to-end analysis of DNA sequencing data and comparison between different platforms

Cited by: 1
Authors
Huang, Ke [1]
Yellapantula, Venkata [1, 2]
Baier, Leslie [1]
Dinu, Valentin [1, 2]
Affiliations
[1] Diabetes Molecular Genetics Section, PECRB, NIDDK, National Institutes of Health, Phoenix, AZ
[2] Department of Biomedical Informatics, Arizona State University, Phoenix, AZ
Funding
National Institutes of Health (United States)
Keywords
Alignment; Annotation; Data analysis; DNA; Genotype calling; Next-generation sequencing
DOI
10.1016/j.compbiomed.2013.05.025
Abstract
We present NGSPE, a pipeline for variation discovery and genotyping of paired-end Illumina next-generation sequencing (NGS) data (http://ngspeanalysis.sourceforge.net/). The pipeline describes a set of sequential analytical steps (short-read alignment, genotype calling, and functional variation annotation) that can be conducted using open-source software tools, and it also provides users with a set of scripts to install the dependent software and resources and to run the pipeline on their own data. A sample summary report is also provided; it includes the concordance rate between data generated by this pipeline and data from different resources, as well as a comparison between replicate samples from two commercial platforms, Illumina and Complete Genomics. Furthermore, some of the mutations identified by the pipeline were verified using Sanger sequencing. © 2013.
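The abstract mentions a concordance rate between call sets but does not say how it is computed. As a minimal sketch, assuming concordance means the fraction of sites called in both sets whose unordered genotypes agree, it could look like the following. The names `Site`, `normalize`, and `concordance_rate` and the sample data are hypothetical illustrations, not part of NGSPE.

```python
from typing import Dict, Tuple

# Hypothetical site key: (chromosome, 1-based position).
Site = Tuple[str, int]


def normalize(genotype: str) -> Tuple[str, ...]:
    """Treat a genotype as an unordered set of alleles, so 'A/G' equals 'G/A'."""
    return tuple(sorted(genotype.replace("|", "/").split("/")))


def concordance_rate(calls_a: Dict[Site, str], calls_b: Dict[Site, str]) -> float:
    """Fraction of sites present in both call sets with matching genotypes.

    Assumption: sites called in only one set are ignored rather than
    counted as discordant.
    """
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        raise ValueError("the two call sets share no sites")
    matches = sum(
        1 for site in shared if normalize(calls_a[site]) == normalize(calls_b[site])
    )
    return matches / len(shared)


if __name__ == "__main__":
    # Made-up genotypes for illustration only.
    platform_a = {("chr1", 1000): "A/G", ("chr1", 2000): "C/C", ("chr2", 500): "T/T"}
    platform_b = {("chr1", 1000): "G/A", ("chr1", 2000): "C/T", ("chr3", 42): "A/A"}
    print(concordance_rate(platform_a, platform_b))
```

Restricting the comparison to shared sites keeps the metric independent of coverage differences between platforms; a stricter definition could count sites missing from one set as discordant.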
Pages: 1171-1176
Number of pages: 5