Intersect-then-combine approach: improving the performance of somatic variant calling in whole exome sequencing data using multiple aligners and callers

Cited by: 42
Authors
Callari, Maurizio [1]
Sammut, Stephen-John [1]
De Mattos-Arruda, Leticia [1]
Bruna, Alejandra [1]
Rueda, Oscar M. [1]
Chin, Suet-Feung [1]
Caldas, Carlos [1]
Affiliations
[1] Univ Cambridge, CRUK Cambridge Inst, Cambridge, England
Keywords
Somatic mutation; Variant calling; Whole exome sequencing; NA12878; Platinum genome; Mutect2; Strelka; BWA; Novoalign; Filtering; CANCER; DISCOVERY; GENOMICS; DNA;
DOI
10.1186/s13073-017-0425-1
Chinese Library Classification
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Bioinformatic analysis of genomic sequencing data to identify somatic mutations in cancer samples is far from achieving the required robustness and standardisation. In this study, we generated a whole exome sequencing benchmark dataset using the platinum genome sample NA12878 and developed an intersect-then-combine (ITC) approach to increase the accuracy of calling single nucleotide variants (SNVs) and indels in tumour-normal pairs. We evaluated the effect of alignment, base quality recalibration, mutation caller and filtering on sensitivity and false positive rate. The ITC approach increased the sensitivity up to 17.1% without increasing the false positive rate per megabase (FPR/Mb), and its validity was confirmed in a set of clinical samples.
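The abstract names the ITC idea but does not spell out its steps, so the following is only a minimal sketch under stated assumptions: calls from each caller are intersected across aligners, and the per-caller intersections are then combined by union. The grouping order, the `Variant` tuple layout, and the function names `intersect_then_combine`, `sensitivity` and `fp_per_mb` are all hypothetical illustrations, not the authors' implementation. The caller and aligner labels are taken from the keyword list, and the variants are toy values.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant key: (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def intersect_then_combine(calls: Dict[str, Dict[str, Set[Variant]]]) -> Set[Variant]:
    """Assumed ITC reading: calls[caller][aligner] is a set of variants.

    Step 1 (intersect): keep, per caller, only variants found with every aligner.
    Step 2 (combine): take the union of those per-caller intersections.
    """
    combined: Set[Variant] = set()
    for per_aligner in calls.values():
        if not per_aligner:
            continue  # a caller with no call sets contributes nothing
        combined |= set.intersection(*per_aligner.values())
    return combined


def sensitivity(called: Set[Variant], truth: Set[Variant]) -> float:
    """Fraction of true variants that were called."""
    return len(called & truth) / len(truth) if truth else 0.0


def fp_per_mb(called: Set[Variant], truth: Set[Variant], target_bp: int) -> float:
    """False positive calls per megabase of targeted sequence."""
    return len(called - truth) / (target_bp / 1_000_000)


# Toy data only; these positions are invented for illustration.
v1, v2, v3 = ("chr1", 100, "A", "T"), ("chr1", 200, "C", "G"), ("chr2", 50, "G", "A")
v4, v5, v6 = ("chr2", 75, "T", "C"), ("chr3", 10, "A", "AG"), ("chr3", 99, "C", "T")

calls = {
    "Mutect2": {"BWA": {v1, v2, v3}, "Novoalign": {v1, v2}},
    "Strelka": {"BWA": {v2, v4}, "Novoalign": {v4, v5}},
}
itc_calls = intersect_then_combine(calls)
truth = {v1, v2, v6}

print(sorted(itc_calls))
print(sensitivity(itc_calls, truth))
print(fp_per_mb(itc_calls, truth, target_bp=2_000_000))
```

Under this reading, the intersection step removes calls that depend on a single aligner, while the union step recovers variants that only one caller detects, which is one way sensitivity could rise without a matching rise in FPR/Mb.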
Pages: 11