ExScalibur: A High-Performance Cloud-Enabled Suite for Whole Exome Germline and Somatic Mutation Identification

Cited by: 9
Authors
Bao, Riyue [1]
Hernandez, Kyle [1]
Huang, Lei [1]
Kang, Wenjun [1]
Bartom, Elizabeth [1]
Onel, Kenan [2]
Volchenboum, Samuel [1, 2, 3]
Andrade, Jorge [1]
Affiliations
[1] Univ Chicago, Ctr Res Informat, Chicago, IL 60637 USA
[2] Univ Chicago, Dept Pediat, Chicago, IL 60637 USA
[3] Univ Chicago, Computat Inst, Chicago, IL 60637 USA
Source
PLOS ONE | 2015 / Vol. 10 / Issue 08
Keywords
VARIANT-CALLING PIPELINES; POINT MUTATIONS; GENOMICS; DISCOVERY; FRAMEWORK; CANCER;
DOI
10.1371/journal.pone.0135800
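Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract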
Whole exome sequencing has facilitated the discovery of causal genetic variants associated with human diseases at deep coverage and low cost. In particular, the detection of somatic mutations from tumor/normal pairs has provided insights into the cancer genome. Although there is an abundance of publicly-available software for the detection of germline and somatic variants, concordance is generally limited among variant callers and alignment algorithms. Successful integration of variants detected by multiple methods requires in-depth knowledge of the software, access to high-performance computing resources, and advanced programming techniques. We present ExScalibur, a set of fully automated, highly scalable and modulated pipelines for whole exome data analysis. The suite integrates multiple alignment and variant calling algorithms for the accurate detection of germline and somatic mutations with close to 99% sensitivity and specificity. ExScalibur implements streamlined execution of analytical modules, real-time monitoring of pipeline progress, robust handling of errors and intuitive documentation that allows for increased reproducibility and sharing of results and workflows. It runs on local computers, high-performance computing clusters and cloud environments. In addition, we provide a data analysis report utility to facilitate visualization of the results that offers interactive exploration of quality control files, read alignment and variant calls, assisting downstream customization of potential disease-causing mutations. ExScalibur is open-source and is also available as a public image on Amazon cloud.
Pages: 13