SVPath: an accurate pipeline for predicting the pathogenicity of human exon structural variants

Cited by: 5
Authors
Yang, Yaning [1]
Wang, Xiaoqi [1]
Zhou, Deshan [1,2]
Wei, Dong-Qing
Peng, Shaoliang [1]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Peoples R China
[2] Shanghai Jiao Tong Univ, School Life Sci & Technol, Shanghai, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
structural variation; SNP; clinical pathogenic; machine learning; exome; GENETIC-VARIATION; DATABASE; MUTATIONS; FRAMEWORK; IMPACT;
DOI
10.1093/bib/bbac014
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Although each individual's chromosomes carry a large number of structural variations, accurate methods for identifying clinically pathogenic variants are still lacking. Here, we propose SVPath, a machine learning-based method for predicting the pathogenicity of deletions, insertions and duplications that occur in exons. We constructed three types of annotation features for each structural variation event in the ClinVar database. First, we treated complex structural variations as multiple consecutive single nucleotide polymorphism events and annotated them with scores derived from single-nucleotide substitutions, such as the predicted impact on protein function. Second, we determined which genes each variation occurred in and constructed gene-based annotation features for each structural variation. Third, we calculated related features based on the transcriptome, such as histone signal and the overlap ratio between the variation and genomic element definitions. Finally, we used a gradient boosting decision tree to train the structural variation pathogenicity prediction model SVPath on the deletions, insertions and duplications in the ClinVar database that are clearly labeled as pathogenic or benign. Experimental results show that SVPath achieves excellent predictive performance and outperforms existing state-of-the-art tools. SVPath is therefore promising for evaluating the clinical pathogenicity of structural variants. It can be used in clinical research to predict the clinical significance of new structural variations and of those with unknown pathogenicity, so as to explore the relationship between diseases and structural variations computationally.
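Two of the feature types named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of max/mean/min as summary statistics, and the half-open coordinate convention are all assumptions made for illustration. The first function summarises per-nucleotide scores across a variant treated as consecutive single-nucleotide events; the second computes the fraction of a variant interval covered by a set of genomic elements.

```python
from statistics import mean


def aggregate_per_base_scores(scores):
    """Summarise per-nucleotide scores across one variant.

    Assumption: max/mean/min are used as summaries; the abstract does not
    say which statistics SVPath actually uses.
    """
    if not scores:
        raise ValueError("variant must cover at least one scored base")
    return {"max": max(scores), "mean": mean(scores), "min": min(scores)}


def overlap_ratio(sv_start, sv_end, elements):
    """Fraction of the half-open interval [sv_start, sv_end) covered by elements.

    `elements` is an iterable of (start, end) half-open intervals on the
    same chromosome (hypothetical input format).
    """
    length = sv_end - sv_start
    if length <= 0:
        raise ValueError("SV interval must have positive length")

    # Clip each element to the SV, then sort so overlapping pieces can be merged.
    clipped = sorted(
        (max(s, sv_start), min(e, sv_end))
        for s, e in elements
        if s < sv_end and e > sv_start
    )

    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            # Close the previous merged block before starting a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            # Extend the current block so shared bases are counted once.
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start

    return covered / length
```

Feature values of this kind would then be assembled into one vector per variant and passed to a gradient boosting decision tree classifier trained on the pathogenic and benign labels.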
Pages: 14