SVPath: an accurate pipeline for predicting the pathogenicity of human exon structural variants

Cited by: 5
Authors
Yang, Yaning [1]
Wang, Xiaoqi [1]
Zhou, Deshan [1,2]
Wei, Dong-Qing
Peng, Shaoliang [1]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Peoples R China
[2] Shanghai Jiao Tong Univ, School Life Sci & Technol, Shanghai, Peoples R China
Funding
National Key Research and Development Program;
Keywords
structural variation; SNP; clinical pathogenic; machine learning; exome; GENETIC-VARIATION; DATABASE; MUTATIONS; FRAMEWORK; IMPACT;
DOI
10.1093/bib/bbac014
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Although each individual's chromosomes carry a large number of structural variations, accurate methods for identifying clinically pathogenic variants are lacking. Here, we propose SVPath, a machine learning-based method for predicting the pathogenicity of deletion, insertion and duplication structural variations that occur in exons. We constructed three types of annotation features for each structural variation event in the ClinVar database. First, we treated complex structural variations as multiple consecutive single nucleotide polymorphism events and annotated them with related scores based on single-nucleotide substitutions, such as the impact on protein function. Second, we determined which genes each variation occurred in and constructed gene-based annotation features for each structural variation. Third, we calculated related features based on the transcriptome, such as histone signal and the ratio of overlap between the variation and defined genomic elements. Finally, we employed a gradient boosting decision tree and trained the structural variation pathogenicity prediction model SVPath on the deletions, insertions and duplications in the ClinVar database that are clearly labeled as pathogenic or benign. Experimental results show that SVPath achieves excellent predictive performance and outperforms existing state-of-the-art tools. SVPath is therefore promising for evaluating the clinical pathogenicity of structural variants. It can be used in clinical research to predict the clinical significance of new structural variations and of those with unknown pathogenicity, so that the relationship between diseases and structural variations can be explored computationally.
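The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the choice of max and mean as summary statistics, and the example coordinates and scores are all assumptions made for illustration. It shows two of the ideas only, namely summarizing per-base SNP-level scores across a variant and computing the overlap ratio between a variant and a set of genomic elements. A gradient boosting classifier would then take such feature vectors as input.

```python
from statistics import mean


def overlap_ratio(sv_start, sv_end, elements):
    """Fraction of the half-open interval [sv_start, sv_end) covered by elements."""
    length = sv_end - sv_start
    if length <= 0:
        return 0.0
    # Clip each element to the variant, then merge the clipped intervals so that
    # bases covered by several elements are counted once.
    clipped = sorted(
        (max(s, sv_start), min(e, sv_end))
        for s, e in elements
        if s < sv_end and e > sv_start
    )
    covered = 0
    cur_start = cur_end = None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / length


def aggregate_snp_scores(per_base_scores):
    """Summarize per-base SNP-level scores into fixed-length features (assumed: max, mean)."""
    if not per_base_scores:
        return {"max": 0.0, "mean": 0.0}
    return {"max": max(per_base_scores), "mean": mean(per_base_scores)}


def build_features(sv_start, sv_end, per_base_scores, exons):
    """Combine SNP-score summaries and exon overlap into one feature dictionary."""
    feats = aggregate_snp_scores(per_base_scores)
    feats["exon_overlap"] = overlap_ratio(sv_start, sv_end, exons)
    return feats


# Hypothetical 10-bp deletion with made-up per-base scores and exon intervals.
features = build_features(
    100, 110, [0.1, 0.9, 0.5, 0.5], [(95, 104), (102, 106), (200, 300)]
)
```

Merging the clipped intervals before summing keeps the overlap ratio within [0, 1] even when annotated elements overlap each other.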
Pages: 14
Related papers
50 in total
  • [31] SMCHD1 genetic variants in type 2 facioscapulohumeral dystrophy and challenges in predicting pathogenicity and disease penetrance
    Gerard, Laurene
    Delourme, Megane
    Tardy, Charlotte
    Ganne, Benjamin
    Perrin, Pierre
    Chaix, Charlene
    Trani, Jean Philippe
    Eudes, Nathalie
    Laberthonniere, Camille
    Bertaux, Karine
    Missirian, Chantal
    Bassez, Guillaume
    Behin, Anthony
    Cintas, Pascal
    Cluse, Florent
    de la Cruz, Elisa
    Delmont, Emilien
    Evangelista, Teresinha
    Fradin, Melanie
    Hadouiri, Nawale
    Kouton, Ludivine
    Laforet, Pascal
    Lefeuvre, Claire
    Magot, Armelle
    Manel, Veronique
    Nectoux, Juliette
    Pegat, Antoine
    Sole, Guilhem
    Spinazzi, Marco
    Stojkovic, Tanya
    Svahn, Juliette
    Tard, Celine
    Thauvin, Christel
    Verebi, Camille
    Salort Campana, Emmanuelle
    Attarian, Shahram
    Nguyen, Karine
    Badache, Ali
    Bernard, Rafaelle
    Magdinier, Frederique
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2024,
  • [32] Assessing the Pathogenicity of In-Frame CACNA1F Indel Variants Using Structural Modeling
    Sallah, Shalaw R.
    Sergouniotis, Panagiotis, I
    Hardcastle, Claire
    Ramsden, Simon
    Lotery, Andrew J.
    Lench, Nick
    Lovell, Simon C.
    Black, Graeme C. M.
    JOURNAL OF MOLECULAR DIAGNOSTICS, 2022, 24 (12) : 1232 - 1239
  • [33] SUMMER: an integrated nanopore sequencing pipeline for variants detection and clinical annotation on the human genome
    Li, Renqiuguo
    Chu, Hongyuan
    Gao, Kai
    Luo, Huaxia
    Jiang, Yuwu
    FUNCTIONAL & INTEGRATIVE GENOMICS, 2025, 25 (01)
  • [34] A high-quality human reference panel reveals the complexity and distribution of genomic structural variants
    Hehir-Kwa, Jayne Y.
    Marschall, Tobias
    Kloosterman, Wigard P.
    Francioli, Laurent C.
    Baaijens, Jasmijn A.
    Dijkstra, Louis J.
    Abdellaoui, Abdel
    Koval, Vyacheslav
    Thung, Djie Tjwan
    Wardenaar, Rene
    Renkens, Ivo
    Coe, Bradley P.
    Deelen, Patrick
    de Ligt, Joep
    Lameijer, Eric-Wubbo
    van Dijk, Freerk
    Hormozdiari, Fereydoun
    Uitterlinden, Andre G.
    van Duijn, Cornelia M.
    Eichler, Evan E.
    de Bakker, Paul I. W.
    Swertz, Morris A.
    Wijmenga, Cisca
    van Ommen, Gert-Jan B.
    Slagboom, P. Eline
    Boomsma, Dorret I.
    Schonhuth, Alexander
    Ye, Kai
    Guryev, Victor
    NATURE COMMUNICATIONS, 2016, 7
  • [35] VarQ: A Tool for the Structural and Functional Analysis of Human Protein Variants
    Radusky, Leandro
    Modenutti, Carlos
    Delgado, Javier
    Bustamante, Juan P.
    Vishnopolska, Sebastian
    Kiel, Christina
    Serrano, Luis
    Marti, Marcelo
    Turjanski, Adrian
    FRONTIERS IN GENETICS, 2018, 9
  • [36] An Efficient Pipeline for the Generation and Functional Analysis of Human BRCA2 Variants of Uncertain Significance
    Hendriks, Giel
    Morolli, Bruno
    Calleja, Fabienne M. G. R.
    Plomp, Anouk
    Mesman, Romy L. S.
    Meijers, Matty
    Sharan, Shyam K.
    Vreeswijk, Maaike P. G.
    Vrieling, Harry
    HUMAN MUTATION, 2014, 35 (11) : 1382 - 1391
  • [37] A Massively Parallel Pipeline to Clone DNA Variants and Examine Molecular Phenotypes of Human Disease Mutations
    Wei, Xiaomu
    Das, Jishnu
    Fragoza, Robert
    Liang, Jin
    de Oliveira, Francisco M. Bastos
    Lee, Hao Ran
    Wang, Xiujuan
    Mort, Matthew
    Stenson, Peter D.
    Cooper, David N.
    Lipkin, Steven M.
    Smolka, Marcus B.
    Yu, Haiyuan
    PLOS GENETICS, 2014, 10 (12):
  • [38] NanoVar: accurate characterization of patients' genomic structural variants using low-depth nanopore sequencing
    Tham, Cheng Yong
    Tirado-Magallanes, Roberto
    Goh, Yufen
    Fullwood, Melissa J.
    Koh, Bryan T. H.
    Wang, Wilson
    Ng, Chin Hin
    Chng, Wee Joo
    Thiery, Alexandre
    Tenen, Daniel G.
    Benoukraf, Touati
    GENOME BIOLOGY, 2020, 21 (01)
  • [39] An Accurate Online Consensus Tool to Interpret Newborn Screening-Related Genetic Variants in Structural Context
    Jose Galano-Frutos, Juan
    Garcia-Cebollada, Helena
    Lopez, Alfonso
    Rosell, Mireia
    de la Cruz, Xavier
    Fernandez-Recio, Juan
    Sancho, Javier
    JOURNAL OF MOLECULAR DIAGNOSTICS, 2022, 24 (04) : 406 - 425
  • [40] Mechanism and modeling of human disease-associated near-exon intronic variants that perturb RNA splicing
    Chiang, Hung-Lun
    Chen, Yi-Ting
    Su, Jia-Ying
    Lin, Hsin-Nan
    Yu, Chen-Hsin Albert
    Hung, Yu-Jen
    Wang, Yun-Lin
    Huang, Yen-Tsung
    Lin, Chien-Ling
    NATURE STRUCTURAL & MOLECULAR BIOLOGY, 2022, 29 (11) : 1043 - +