A machine learning approach for somatic mutation discovery

Cited by: 74
Authors
Wood, Derrick E. [1]
White, James R. [1]
Georgiadis, Andrew [1]
Van Emburgh, Beth [1]
Parpart-Li, Sonya [1]
Mitchell, Jason [1]
Anagnostou, Valsamo [2]
Niknafs, Noushin [2]
Karchin, Rachel [2, 3]
Papp, Eniko [1]
McCord, Christine [1]
LoVerso, Peter [1]
Riley, David [1]
Diaz, Luis A., Jr. [4]
Jones, Sian [1]
Sausen, Mark [1]
Velculescu, Victor E. [2]
Angiuoli, Samuel V. [1]
Affiliations
[1] Personal Genome Diagnostics, Baltimore, MD 21224 USA
[2] Johns Hopkins University School of Medicine, Sidney Kimmel Comprehensive Cancer Center, Baltimore, MD 21287 USA
[3] Johns Hopkins University, Department of Biomedical Engineering, Institute for Computational Medicine, Baltimore, MD 21218 USA
[4] Memorial Sloan Kettering Cancer Center, New York, NY 10065 USA
Keywords
COMPREHENSIVE MOLECULAR CHARACTERIZATION; GENERATION SEQUENCING PANEL; GENOMIC CHARACTERIZATION; CLINICAL VALIDATION; POINT MUTATIONS; READ ALIGNMENT; HUMAN BREAST; OPEN-LABEL; CANCER; TUMOR;
DOI
10.1126/scitranslmed.aar7939
Chinese Library Classification number
Q2 [Cell Biology]
Discipline classification codes
071009; 090102
Abstract
Variability in the accuracy of somatic mutation detection may affect the discovery of alterations and the therapeutic management of cancer patients. To address this issue, we developed a somatic mutation discovery approach based on machine learning that outperformed existing methods in identifying experimentally validated tumor alterations (sensitivity of 97% versus 90 to 99%; positive predictive value of 98% versus 34 to 92%). Analysis of paired tumor-normal exome data from 1368 TCGA (The Cancer Genome Atlas) samples using this method revealed concordance for 74% of mutation calls but also identified likely false-positive and false-negative changes in TCGA data, including in clinically actionable genes. Determination of high-quality somatic mutation calls improved tumor mutation load-based predictions of clinical outcome for melanoma and lung cancer patients previously treated with immune checkpoint inhibitors. Integration of high-quality machine learning mutation detection in clinical next-generation sequencing (NGS) analyses increased the accuracy of test results compared to other clinical sequencing analyses. These analyses provide an approach for improved identification of tumor-specific mutations and have important implications for research and clinical management of cancer patients.
Pages: 16
Related papers
50 in total
  • [1] A machine learning approach for somatic mutation discovery
    Wood, Derrick
    White, James
    Georgiadis, Andrew
    Van Emburgh, Beth
    Parpart-Li, Sonya
    Mitchell, Jason
    Anagnostou, Valsamo
    Niknafs, Noushin
    Karchin, Rachel
    Papp, Eniko
    McCord, Christine
    LoVerso, Peter
    Riley, David
    Diaz, Luis A.
    Jones, Sian
    Sausen, Mark
    Velculescu, Victor E.
    Angiuoli, Samuel
    CANCER RESEARCH, 2018, 78 (13)
  • [2] Somatic Mutation Detection Using Ensemble of Machine Learning
    Yu, Xingyu
    Li, Xiang
    Tong, Jijun
    Yang, Bin
    ADVANCED INTELLIGENT COMPUTING IN BIOINFORMATICS, PT II, ICIC 2024, 2024, 14882 : 444 - 453
  • [3] Machine learning approach to electrocatalyst discovery
    Xin, Hongliang
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2017, 253
  • [4] Database dependency discovery: a machine learning approach
    Flach, PA
    Savnik, I
    AI COMMUNICATIONS, 1999, 12 (03) : 139 - 160
  • [5] Machine-learning approach for discovery of conventional superconductors
    Tran, Huan
    Vu, Tuoc N.
    PHYSICAL REVIEW MATERIALS, 2023, 7 (05)
  • [6] A Machine Learning Approach to Service Discovery for Microservice Architectures
    Caporuscio, Mauro
    De Toma, Marco
    Muccini, Henry
    Vaidhyanathan, Karthik
    SOFTWARE ARCHITECTURE, ECSA 2021, 2021, 12857 : 66 - 82
  • [7] Using machine learning to predict tissue of origin from somatic mutation features
    Giorni, Andrea
    Sivasubramiam, Prabu
    Kubeyev, Aidan
    Laurie, Jordan
    Silva, Luiz
    Foster, Matthew
    Asghar, Uzma
    Griffiths, Matthew
    CANCER RESEARCH, 2023, 83 (07)
  • [8] A machine learning based approach for phononic crystal property discovery
    Sadat, Seid M.
    Wang, Robert Y.
    JOURNAL OF APPLIED PHYSICS, 2020, 128 (02)
  • [9] Demographics and Personality Discovery on Social Media: A Machine Learning Approach
    Tuomchomtam, Sarach
    Soonthornphisaj, Nuanwan
    INFORMATION, 2021, 12 (09)
  • [10] Combining Mutation and Gene Network Data in a Machine Learning Approach for False-Positive Cancer Driver Gene Discovery
    Cutigi, Jorge Francisco
    Evangelista, Renato Feijo
    Ramos, Rodrigo Henrique
    Lage Ferreira, Cynthia de Oliveira
    Evangelista, Adriane Feijo
    de Carvalho, Andre C. P. L. F.
    Simao, Adenilso
    ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, BSB 2020, 2020, 12558 : 81 - 92