DNAp: A Pipeline for DNA-seq Data Analysis

Cited by: 11
Authors
Causey, Jason L. [1]
Ashby, Cody [2, 3]
Walker, Karl [4]
Wang, Zhiping Paul [5]
Yang, Mary [6]
Guan, Yuanfang [7]
Moore, Jason H. [5]
Huang, Xiuzhen [1]
Affiliations
[1] Arkansas State Univ, Dept Comp Sci, Jonesboro, AR 72467 USA
[2] Univ Arkansas Med Sci, Dept Biomed Informat, Little Rock, AR 72205 USA
[3] Univ Arkansas Med Sci, Myeloma Inst, Little Rock, AR 72205 USA
[4] Univ Arkansas, Dept Math & Comp Sci, Pine Bluff, AR USA
[5] Univ Penn, Inst Biomed Informat, Philadelphia, PA 19104 USA
[6] Univ Arkansas, Dept Informat Sci, Little Rock, AR 72204 USA
[7] Univ Michigan, Dept Computat Med & Bioinformat, Ann Arbor, MI 48109 USA
Funding
US National Science Foundation
Keywords
VARIANT; FRAMEWORK; GALAXY;
DOI
10.1038/s41598-018-25022-6
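Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Subject classification code
07; 0710; 09
Abstract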
Next-generation sequencing is empowering genetic disease research, but it also poses significant challenges for efficient and effective analysis of sequencing data. We built a pipeline, DNAp, for analyzing whole exome sequencing (WES) and whole genome sequencing (WGS) data to detect mutations in disease samples. The pipeline is distributed as a Docker container, so it runs as a fully automatic process on any system that supports Docker. It is also open and easily customized through user intervention points, for example to update reference files or to swap in different software or software versions. The pipeline has been tested on both human and mouse sequencing datasets; the mutation results it generated are comparable to published results for these datasets and reproducible across heterogeneous hardware platforms. DNAp was funded by the US Food and Drug Administration (FDA) and developed for analyzing the FDA's DNA sequencing data. Here we release DNAp as open source, with the software and documentation publicly available at http://bioinformatics.astate.edu/dna-pipeline/.
Pages: 9