An empirical Bayesian framework for somatic mutation detection from cancer genome sequencing data

Cited by: 172
Authors
Shiraishi, Yuichi [1]
Sato, Yusuke [2,3]
Chiba, Kenichi [1]
Okuno, Yusuke [2]
Nagata, Yasunobu [2]
Yoshida, Kenichi [2]
Shiba, Norio [2,4]
Hayashi, Yasuhide [4]
Kume, Haruki [3]
Homma, Yukio [3]
Sanada, Masashi [2]
Ogawa, Seishi [2]
Miyano, Satoru [1]
Affiliations
[1] Univ Tokyo, Lab DNA Informat Anal, Ctr Human Genome, Inst Med Sci, Minato Ku, Tokyo 1088639, Japan
[2] Univ Tokyo, Canc Genom Project, Grad Sch Med, Bunkyo Ku, Tokyo 1138655, Japan
[3] Univ Tokyo, Dept Urol, Grad Sch Med, Bunkyo Ku, Tokyo 1138655, Japan
[4] Gunma Childrens Med Ctr, Dept Hematol Oncol, Gunma 3770061, Japan
Keywords
ALIGNMENT; EVOLUTION; VARIANTS;
DOI
10.1093/nar/gkt126
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification codes
071010 ; 081704 ;
Abstract
Recent advances in high-throughput sequencing technologies have enabled a comprehensive dissection of the cancer genome clarifying a large number of somatic mutations in a wide variety of cancer types. A number of methods have been proposed for mutation calling based on a large amount of sequencing data, which is accomplished in most cases by statistically evaluating the difference in the observed allele frequencies of possible single nucleotide variants between tumours and paired normal samples. However, an accurate detection of mutations remains a challenge under low sequencing depths or tumour contents. To overcome this problem, we propose a novel method, Empirical Bayesian mutation Calling (https://github.com/friend1ws/EBCall), for detecting somatic mutations. Unlike previous methods, the proposed method discriminates somatic mutations from sequencing errors based on an empirical Bayesian framework, where the model parameters are estimated using sequencing data from multiple non-paired normal samples. Using 13 whole-exome sequencing data with 87.5-206.3 mean sequencing depths, we demonstrate that our method not only outperforms several existing methods in the calling of mutations with moderate allele frequencies but also enables accurate calling of mutations with low allele frequencies (10%) harboured within a minor tumour subpopulation, thus allowing for the deciphering of fine substructures within a tumour specimen.
Pages: 10
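The idea summarised in the abstract, estimating a site-specific sequencing-error model from several non-paired normal samples and then asking whether the tumour's variant read count is plausible under that model, can be illustrated with a rough sketch. This is not the authors' implementation: the beta-binomial error model, the method-of-moments fit, and all function names below are assumptions made for illustration only.

```python
from math import lgamma, exp


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, alpha, beta):
    """P(K = k) for K ~ BetaBinomial(n, alpha, beta)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


def fit_error_model(normal_counts):
    """Estimate Beta(alpha, beta) for the error rate at one site.

    normal_counts: list of (mismatch_reads, depth) pairs, one per
    non-paired normal sample. Uses a simple method-of-moments fit on the
    observed mismatch rates (an assumption, chosen for brevity).
    """
    if len(normal_counts) < 2:
        raise ValueError("need at least two normal samples")
    rates = [k / n for k, n in normal_counts]
    mean = sum(rates) / len(rates)
    var = sum((r - mean) ** 2 for r in rates) / (len(rates) - 1)
    # Clamp so that the resulting Beta parameters stay positive and finite.
    mean = min(max(mean, 1e-4), 1 - 1e-4)
    var = min(max(var, 1e-8), 0.99 * mean * (1 - mean))
    common = mean * (1 - mean) / var - 1
    return mean * common, (1 - mean) * common


def error_tail_probability(k, n, alpha, beta):
    """P(K >= k) under the error model: small values argue against pure error."""
    total = sum(beta_binomial_pmf(i, n, alpha, beta) for i in range(k, n + 1))
    return min(1.0, total)


# Hypothetical counts at one genomic position (made-up numbers).
normals = [(1, 100), (0, 120), (2, 150), (1, 90), (0, 110)]
alpha, beta = fit_error_model(normals)
p_error = error_tail_probability(12, 100, alpha, beta)  # tumour: 12 of 100 reads
print(f"alpha={alpha:.3f}, beta={beta:.3f}, P(error >= 12/100)={p_error:.3g}")
```

A candidate with a small tail probability is poorly explained by the error behaviour seen in the normal panel. A real caller would also compare against the paired normal and apply further filters, which this sketch omits.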