SVPath: an accurate pipeline for predicting the pathogenicity of human exon structural variants

Cited by: 5
Authors
Yang, Yaning [1]
Wang, Xiaoqi [1]
Zhou, Deshan [1,2]
Wei, Dong-Qing
Peng, Shaoliang [1]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Peoples R China
[2] Shanghai Jiao Tong Univ, School Life Sci & Technol, Shanghai, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
structural variation; SNP; clinical pathogenic; machine learning; exome; GENETIC-VARIATION; DATABASE; MUTATIONS; FRAMEWORK; IMPACT;
DOI
10.1093/bib/bbac014
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Although each individual carries a large number of structural variations in their chromosomes, accurate methods for identifying clinically pathogenic variants are lacking. Here, we propose SVPath, a machine learning-based method to predict the pathogenicity of deletion, insertion and duplication structural variations that occur in exons. We constructed three types of annotation features for each structural variation event in the ClinVar database. First, we treated complex structural variations as multiple consecutive single nucleotide polymorphism events, and annotated them with scores defined for single-nucleotide substitutions, such as the impact on protein function. Second, we determined which genes each variation occurred in, and constructed gene-based annotation features for each structural variation. Third, we calculated related features based on the transcriptome, such as histone signal and the overlap ratio between the variation and defined genomic elements. Finally, we employed a gradient boosting decision tree and used the deletions, insertions and duplications in the ClinVar database that are clearly labeled as pathogenic or benign to train the structural variation pathogenicity prediction model SVPath. Experimental results show that SVPath achieves excellent predictive performance and outperforms existing state-of-the-art tools. SVPath is therefore promising for evaluating the clinical pathogenicity of structural variants. It can be used in clinical research to predict the clinical significance of new structural variations and of those with unknown pathogenicity, so as to explore the relationship between diseases and structural variations computationally.
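The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the SVPath implementation: the function names, the half-open interval convention, the choice of mean and maximum as summary statistics, and the 0.0 fallback for missing scores are all assumptions made for illustration. It shows how per-base substitution scores might be summarised across a structural variation and how an overlap ratio with genomic elements might be computed.

```python
from statistics import mean


def overlap_ratio(sv_start, sv_end, elements):
    """Fraction of the half-open SV interval [sv_start, sv_end) covered by elements.

    `elements` is a list of (start, end) half-open intervals (hypothetical input format).
    """
    sv_len = sv_end - sv_start
    if sv_len <= 0:
        return 0.0
    # Clip each overlapping element to the SV boundaries and sort by start.
    clipped = sorted(
        (max(s, sv_start), min(e, sv_end))
        for s, e in elements
        if s < sv_end and e > sv_start
    )
    # Merge overlapping pieces so that each base is counted only once.
    covered = 0
    cur_start, cur_end = None, None
    for s, e in clipped:
        if cur_end is None or s > cur_end:
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / sv_len


def aggregate_snp_scores(per_base_scores):
    """Summarise per-base substitution scores over the SV (assumed statistics: mean, max)."""
    if not per_base_scores:
        # Assumption: fall back to 0.0 when no per-base score is available.
        return {"mean": 0.0, "max": 0.0}
    return {"mean": mean(per_base_scores), "max": max(per_base_scores)}


def build_features(sv_start, sv_end, per_base_scores, elements):
    """Assemble one illustrative feature vector for a single SV."""
    agg = aggregate_snp_scores(per_base_scores)
    return [agg["mean"], agg["max"], overlap_ratio(sv_start, sv_end, elements)]
```

Feature vectors of this kind, together with pathogenic or benign labels, would then be passed to a gradient boosting decision tree classifier; the actual feature set and training setup are those described in the paper, not this sketch.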
Pages: 14
Related papers
50 items in total
  • [1] Rhapsody: predicting the pathogenicity of human missense variants
    Ponzoni, Luca
    Penaherrera, Daniel A.
    Oltvai, Zoltan N.
    Bahar, Ivet
    BIOINFORMATICS, 2020, 36 (10) : 3084 - 3092
  • [2] Predicting pathogenicity of missense variants with weakly supervised regression
    Cao, Yue
    Sun, Yuanfei
    Karimi, Mostafa
    Chen, Haoran
    Moronfoye, Oluwaseyi
    Shen, Yang
    HUMAN MUTATION, 2019, 40 (09) : 1579 - 1592
  • [3] StrVCTVRE: A supervised learning method to predict the pathogenicity of human genome structural variants
    Sharo, Andrew G.
    Hu, Zhiqiang
    Sunyaev, Shamil R.
    Brenner, Steven E.
    AMERICAN JOURNAL OF HUMAN GENETICS, 2022, 109 (02) : 195 - 209
  • [4] iFish: predicting the pathogenicity of human nonsynonymous variants using gene-specific/family-specific attributes and classifiers
    Wang, Meng
    Wei, Liping
    SCIENTIFIC REPORTS, 2016, 6
  • [5] SvAnna: efficient and accurate pathogenicity prediction of coding and regulatory structural variants in long-read genome sequencing
    Danis, Daniel
    Jacobsen, Julius O. B.
    Balachandran, Parithi
    Zhu, Qihui
    Yilmaz, Feyza
    Reese, Justin
    Haimel, Matthias
    Lyon, Gholson J.
    Helbig, Ingo
    Mungall, Christopher J.
    Beck, Christine R.
    Lee, Charles
    Smedley, Damian
    Robinson, Peter N.
    GENOME MEDICINE, 2022, 14 (01)
  • [6] REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants
    Ioannidis, Nilah M.
    Rothstein, Joseph H.
    Pejaver, Vikas
    Middha, Sumit
    McDonnell, Shannon K.
    Baheti, Saurabh
    Musolf, Anthony
    Li, Qing
    Holzinger, Emily
    Karyadi, Danielle
    Cannon-Albright, Lisa A.
    Teerlink, Craig C.
    Stanford, Janet L.
    Isaacs, William B.
    Xu, Jianfeng
    Cooney, Kathleen A.
    Lange, Ethan M.
    Schleutker, Johanna
    Carpten, John D.
    Powell, Isaac J.
    Cussenot, Olivier
    Cancel-Tassin, Geraldine
    Giles, Graham G.
    MacInnis, Robert J.
    Maier, Christiane
    Hsieh, Chih-Lin
    Wiklund, Fredrik
    Catalona, William J.
    Foulkes, William D.
    Mandal, Diptasri
    Eeles, Rosalind A.
    Kote-Jarai, Zsofia
    Bustamante, Carlos D.
    Schaid, Daniel J.
    Hastie, Trevor
    Ostrander, Elaine A.
    Bailey-Wilson, Joan E.
    Radivojac, Predrag
    Thibodeau, Stephen N.
    Whittemore, Alice S.
    Sieh, Weiva
    AMERICAN JOURNAL OF HUMAN GENETICS, 2016, 99 (04) : 877 - 885
  • [7] LYRUS: a machine learning model for predicting the pathogenicity of missense variants
    Lai, Jiaying
    Yang, Jordan
    Gamsiz Uzun, Ece D.
    Rubenstein, Brenda M.
    Sarkar, Indra Neil
    BIOINFORMATICS ADVANCES, 2022, 2 (01)
  • [8] MAPPIN: a method for annotating, predicting pathogenicity and mode of inheritance for nonsynonymous variants
    Gosalia, Nehal
    Economides, Aris N.
    Dewey, Frederick E.
    Balasubramanian, Suganthi
    NUCLEIC ACIDS RESEARCH, 2017, 45 (18) : 10393 - 10402
  • [9] CADD: predicting the deleteriousness of variants throughout the human genome
    Rentzsch, Philipp
    Witten, Daniela
    Cooper, Gregory M.
    Shendure, Jay
    Kircher, Martin
    NUCLEIC ACIDS RESEARCH, 2019, 47 (D1) : D886 - D894
  • [10] SVFX: a machine learning framework to quantify the pathogenicity of structural variants
    Kumar, Sushant
    Harmanci, Arif
    Vytheeswaran, Jagath
    Gerstein, Mark B.
    GENOME BIOLOGY, 2020, 21 (01) : 274