SVFX: a machine learning framework to quantify the pathogenicity of structural variants

Times cited: 21
Authors
Kumar, Sushant [1, 2]
Harmanci, Arif [3]
Vytheeswaran, Jagath [4]
Gerstein, Mark B. [1, 2, 5]
Affiliations
[1] Yale Univ, Program Computat Biol & Bioinformat, New Haven, CT 06520 USA
[2] Yale Univ, Dept Mol Biophys & Biochem, POB 6666, New Haven, CT 06520 USA
[3] Univ Texas Hlth Sci Ctr Houston, Sch Biomed Informat, Ctr Precis Hlth, Houston, TX 77030 USA
[4] CALTECH, Dept Comp & Math Sci, Pasadena, CA 91125 USA
[5] Yale Univ, Dept Comp Sci, 260-266 Whitney Ave,POB 208114, New Haven, CT 06520 USA
Funding
National Institutes of Health (USA)
Keywords
IMPACT; SETD3; MUTATIONS;
DOI
10.1186/s13059-020-02178-x
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
There is a lack of approaches for identifying pathogenic genomic structural variants (SVs) although they play a crucial role in many diseases. We present a mechanism-agnostic machine learning-based workflow, called SVFX, to assign pathogenicity scores to somatic and germline SVs. In particular, we generate somatic and germline training models, which include genomic, epigenomic, and conservation-based features, for SV call sets in diseased and healthy individuals. We then apply SVFX to SVs in cancer and other diseases; SVFX achieves high accuracy in identifying pathogenic SVs. Predicted pathogenic SVs in cancer cohorts are enriched among known cancer genes and many cancer-related pathways.
Pages: 21