SVPath: an accurate pipeline for predicting the pathogenicity of human exon structural variants

Cited by: 5
Authors
Yang, Yaning [1]
Wang, Xiaoqi [1]
Zhou, Deshan [1,2]
Wei, Dong-Qing
Peng, Shaoliang [1]
Affiliations
[1] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Peoples R China
[2] Shanghai Jiao Tong Univ, School Life Sci & Technol, Shanghai, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
structural variation; SNP; clinical pathogenic; machine learning; exome; GENETIC-VARIATION; DATABASE; MUTATIONS; FRAMEWORK; IMPACT
DOI
10.1093/bib/bbac014
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Although each individual's chromosomes carry a large number of structural variations, accurate methods for identifying clinically pathogenic variants are lacking. Here, we propose SVPath, a machine learning-based method for predicting the pathogenicity of deletion, insertion and duplication structural variations that occur in exons. We constructed three types of annotation features for each structural variation event in the ClinVar database. First, we treated complex structural variations as multiple consecutive single nucleotide polymorphism events and annotated them with scores derived from single-nucleotide substitutions, such as the impact on protein function. Second, we determined which genes each variation occurred in and constructed gene-based annotation features for each structural variation. Third, we calculated related features based on the transcriptome, such as histone signal and the overlap ratio between the variation and defined genomic elements. Finally, we employed a gradient boosting decision tree and trained the structural variation pathogenicity prediction model, SVPath, on the deletions, insertions and duplications in the ClinVar database that are clearly labeled as pathogenic or benign. Experimental results show that SVPath achieves excellent predictive performance and outperforms existing state-of-the-art tools. SVPath is very promising for evaluating the clinical pathogenicity of structural variants. It can be used in clinical research to predict the clinical significance of new structural variations and of those with unknown pathogenicity, helping to explore the relationship between diseases and structural variations computationally.
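As a rough illustration of two of the feature ideas named in the abstract, the sketch below computes an overlap ratio between a variant and a set of genomic elements, and summarizes per-base scores for a variant treated as consecutive single-nucleotide events. It is not the SVPath implementation: the function names (`merge_intervals`, `overlap_ratio`, `summarize_scores`), the half-open coordinate convention, and the choice of max and mean as summaries are all assumptions made for this example.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

# Hypothetical convention: half-open [start, end) intervals on one chromosome.
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if start >= end:
            continue  # skip empty or inverted intervals
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_ratio(variant: Interval, elements: Iterable[Interval]) -> float:
    """Fraction of the variant's bases covered by at least one element."""
    v_start, v_end = variant
    length = v_end - v_start
    if length <= 0:
        return 0.0
    covered = 0
    for e_start, e_end in merge_intervals(elements):
        covered += max(0, min(v_end, e_end) - max(v_start, e_start))
    return covered / length


def summarize_scores(scores: Sequence[float]) -> Dict[str, float]:
    """Summarize per-base scores of a variant as max and mean.

    Assumption for this sketch: an empty score list maps to zeros.
    """
    if not scores:
        return {"max": 0.0, "mean": 0.0}
    return {"max": max(scores), "mean": sum(scores) / len(scores)}


if __name__ == "__main__":
    # A 100-bp variant; the elements cover 50 + 20 of its bases.
    print(overlap_ratio((100, 200), [(90, 120), (110, 150), (180, 250)]))  # 0.7
    print(summarize_scores([0.2, 0.9, 0.4]))
```

Features of this kind, combined with gene-level annotations, would form the input vector for a gradient boosting classifier trained on variants labeled pathogenic or benign.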
Pages: 14