In-depth comparison of somatic point mutation callers based on different tumor next-generation sequencing depth data

Cited by: 77
Authors
Cai, Lei [1, 2]
Yuan, Wei [1]
Zhang, Zhou [1, 3]
He, Lin [1, 4]
Chou, Kuo-Chen [2, 5]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai Key Lab Psychot Disorders 13dz2260500, Key Lab Genet Dev & Neuropsychiat Disorders, Bio X Inst,Minist Educ, Shanghai 200030, Peoples R China
[2] Gordon Life Sci Inst, Boston, MA 02478 USA
[3] Shanghai Jiao Tong Univ, Sch Med, Inst Biliary Tract Dis, Xinhua Hosp, Shanghai 200092, Peoples R China
[4] Zhejiang Univ, Sch Med, Womens Hosp, Hangzhou 310006, Zhejiang, Peoples R China
[5] King Abdulaziz Univ, CEGMR, Jeddah 21589, Saudi Arabia
Source
SCIENTIFIC REPORTS | 2016 / Vol. 6
Keywords
CANCER GENOMES; SNV DETECTION; WHOLE-EXOME; WEB SERVER; IDENTIFICATION; VARIANTS; MODES; DISCOVERY; PACKAGE; PSEKNC;
DOI
10.1038/srep36540
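Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract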
Four popular somatic single nucleotide variant (SNV) calling methods (Varscan, SomaticSniper, Strelka and MuTect2) were carefully evaluated on real whole-exome sequencing (WES, depth of ~50X) and ultra-deep targeted sequencing (UDT-Seq, depth of ~370X) data. The four tools showed poor consensus on candidates (only 20% of calls were made by more than one caller). For both WES and UDT-Seq, MuTect2 and Strelka obtained the largest proportion of COSMIC entries as well as the lowest rates of dbSNP presence and of calls with high alternative-allele counts in the control sample, demonstrating their superior sensitivity and accuracy. Combining different callers does increase the reliability of candidates, but narrows the list down to a very limited range of tumor read depth and variant allele frequency. Calling SNVs on UDT-Seq data, which had much higher read depth, discovered additional true-positive variants, albeit with an even larger increase in false-positive predictions. Our findings not only provide a valuable benchmark for state-of-the-art SNV calling methods, but also shed light on the path toward more accurate SNV identification in the future.
Pages: 9