An empirical Bayesian framework for somatic mutation detection from cancer genome sequencing data

Cited by: 172
Authors
Shiraishi, Yuichi [1]
Sato, Yusuke [2, 3]
Chiba, Kenichi [1]
Okuno, Yusuke [2]
Nagata, Yasunobu [2]
Yoshida, Kenichi [2]
Shiba, Norio [2, 4]
Hayashi, Yasuhide [4]
Kume, Haruki [3]
Homma, Yukio [3]
Sanada, Masashi [2]
Ogawa, Seishi [2]
Miyano, Satoru [1]
Affiliations
[1] Univ Tokyo, Lab DNA Informat Anal, Ctr Human Genome, Inst Med Sci, Minato Ku, Tokyo 1088639, Japan
[2] Univ Tokyo, Canc Genom Project, Grad Sch Med, Bunkyo Ku, Tokyo 1138655, Japan
[3] Univ Tokyo, Dept Urol, Grad Sch Med, Bunkyo Ku, Tokyo 1138655, Japan
[4] Gunma Childrens Med Ctr, Dept Hematol Oncol, Gunma 3770061, Japan
Keywords
ALIGNMENT; EVOLUTION; VARIANTS;
DOI
10.1093/nar/gkt126
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Recent advances in high-throughput sequencing technologies have enabled a comprehensive dissection of the cancer genome, clarifying a large number of somatic mutations in a wide variety of cancer types. A number of methods have been proposed for mutation calling based on a large amount of sequencing data, which is accomplished in most cases by statistically evaluating the difference in the observed allele frequencies of possible single nucleotide variants between tumours and paired normal samples. However, accurate detection of mutations remains a challenge at low sequencing depths or low tumour content. To overcome this problem, we propose a novel method, Empirical Bayesian mutation Calling (https://github.com/friend1ws/EBCall), for detecting somatic mutations. Unlike previous methods, the proposed method discriminates somatic mutations from sequencing errors based on an empirical Bayesian framework, where the model parameters are estimated using sequencing data from multiple non-paired normal samples. Using 13 whole-exome sequencing data sets with mean sequencing depths of 87.5-206.3, we demonstrate that our method not only outperforms several existing methods in the calling of mutations with moderate allele frequencies but also enables accurate calling of mutations with low allele frequencies (10%) harboured within a minor tumour subpopulation, thus allowing for the deciphering of fine substructures within a tumour specimen.
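The idea of learning a site-specific sequencing-error model from non-paired normal samples and then asking whether a tumour's mismatch count is plausible under that model can be sketched as follows. This is only an illustration under stated assumptions, not the EBCall implementation: the beta-binomial error model, the method-of-moments fit, the 0.5 pseudo-count, and all function names and read counts are hypothetical choices made for this example.

```python
import math

def fit_beta_moments(normal_counts):
    """Method-of-moments Beta(alpha, beta) fit to mismatch rates at one site.

    normal_counts: list of (mismatch_reads, depth) pairs, one per normal sample.
    A 0.5 pseudo-count keeps every rate strictly inside (0, 1) (assumption).
    """
    rates = [(m + 0.5) / (d + 1.0) for m, d in normal_counts]
    n = len(rates)
    mean = sum(rates) / n
    var = sum((r - mean) ** 2 for r in rates) / (n - 1)
    # The moment estimator needs 0 < var < mean * (1 - mean); clamp if not.
    var = min(max(var, 1e-12), 0.999 * mean * (1.0 - mean))
    common = mean * (1.0 - mean) / var - 1.0
    return mean * common, (1.0 - mean) * common

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(k, n, alpha, beta):
    """P(X = k) for X ~ BetaBinomial(n, alpha, beta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose
                    + log_beta(k + alpha, n - k + beta)
                    - log_beta(alpha, beta))

def error_tail_probability(k, n, alpha, beta):
    """P(X >= k): chance of seeing k or more mismatches from error alone."""
    total = sum(beta_binomial_pmf(i, n, alpha, beta) for i in range(k, n + 1))
    return min(1.0, total)

# Hypothetical (mismatch_reads, depth) counts at one site in five normal samples.
normals = [(1, 100), (0, 120), (2, 150), (1, 90), (0, 110)]
alpha, beta = fit_beta_moments(normals)

# Tumour sample at depth 100: 2 mismatches versus 12 mismatches.
p_noise_like = error_tail_probability(2, 100, alpha, beta)
p_variant_like = error_tail_probability(12, 100, alpha, beta)
print(f"alpha={alpha:.2f}, beta={beta:.2f}")
print(f"P(X >= 2)  = {p_noise_like:.3g}")
print(f"P(X >= 12) = {p_variant_like:.3g}")
```

A small tail probability suggests that the tumour's mismatch count is hard to explain by the error behaviour seen in the normal panel, which is the intuition behind separating low-frequency mutations from site-specific sequencing errors.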
Pages: 10