mvPPT: A Highly Efficient and Sensitive Pathogenicity Prediction Tool for Missense Variants

Times cited: 4
Authors
Tong, Shi-Yuan [1]
Fan, Ke [1]
Zhou, Zai-Wei [2]
Liu, Lin-Yun [1]
Zhang, Shu-Qing [1]
Fu, Yinghui [1]
Wang, Guang-Zhong [3]
Zhu, Ying [4]
Yu, Yong-Chun [1]
Affiliations
[1] Fudan Univ, Jingan Dist Cent Hosp Shanghai, Inst Brain Sci, MOE Frontiers Ctr Brain Sci,State Key Lab Med Neur, Shanghai 200032, Peoples R China
[2] Shanghai Xunyin Biotechnol Co Ltd, Shanghai 201802, Peoples R China
[3] Univ Chinese Acad Sci, Chinese Acad Sci, Shanghai Inst Nutr & Hlth, CAS Key Lab Computat Biol, Shanghai 200031, Peoples R China
[4] Fudan Univ, MOE Frontiers Ctr Brain Sci, Inst Brain Sci, State Key Lab Med Neurobiol,Huashan Hosp, Shanghai 200032, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Machine learning; Missense variant; Genomics; Computational biology; Pathogenicity prediction; FUNCTIONAL IMPACT; DATABASE; MUTATIONS; DIAGNOSIS; ELEMENTS;
DOI
10.1016/j.gpb.2022.07.005
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Next-generation sequencing technologies both boost the discovery of variants in the human genome and exacerbate the challenges of pathogenic variant identification. In this study, we developed the Pathogenicity Prediction Tool for missense variants (mvPPT), a highly sensitive and accurate missense variant classifier based on gradient boosting. mvPPT adopts high-confidence training sets with a wide spectrum of variant profiles and extracts three categories of features: scores from existing prediction tools, frequencies (allele frequencies, amino acid frequencies, and genotype frequencies), and genomic context. Compared with established predictors, mvPPT achieves superior performance in all test sets, regardless of data source. Our study also provides guidance on training set and feature selection strategies and reveals highly relevant features, which may offer further biological insights into variant pathogenicity. mvPPT is freely available at http://www.mvppt.club/.
Pages: 414-426
Number of pages: 13
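The abstract describes a gradient-boosting classifier over three feature categories (scores from existing tools, frequencies, and genomic context). The sketch below illustrates that general technique only; it is not the authors' implementation. The feature names (`tool_score`, `allele_freq`, `conservation`), the synthetic data generator, and all hyperparameters are assumptions made for illustration, and the model is a plain-Python boosting loop over decision stumps rather than any particular library.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_variants(n, rng):
    """Generate synthetic variants (hypothetical features, not real data)."""
    X, y = [], []
    for _ in range(n):
        pathogenic = rng.random() < 0.5
        # One illustrative feature per category named in the abstract.
        tool_score = rng.gauss(0.75 if pathogenic else 0.30, 0.15)
        allele_freq = rng.uniform(0.0, 0.001 if pathogenic else 0.05)
        conservation = rng.gauss(0.7 if pathogenic else 0.4, 0.2)
        X.append([tool_score, allele_freq, conservation])
        y.append(1 if pathogenic else 0)
    return X, y


def fit_stump(X, residuals):
    """Least-squares fit of a one-split regression stump to the residuals."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left)
            sse += sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]


def train(X, y, rounds=15, lr=0.3):
    """Gradient boosting on log loss: each stump fits y - sigmoid(score)."""
    p0 = sum(y) / len(y)
    base = math.log(p0 / (1.0 - p0))
    scores = [base] * len(X)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, lv, rv = fit_stump(X, residuals)
        stumps.append((j, t, lv, rv))
        scores = [s + lr * (lv if row[j] <= t else rv)
                  for s, row in zip(scores, X)]
    return base, lr, stumps


def predict_proba(model, row):
    """Return the predicted probability that a variant is pathogenic."""
    base, lr, stumps = model
    score = base + sum(lr * (lv if row[j] <= t else rv)
                       for j, t, lv, rv in stumps)
    return sigmoid(score)


rng = random.Random(0)
X_train, y_train = make_variants(120, rng)
X_test, y_test = make_variants(60, rng)
model = train(X_train, y_train)
predictions = [predict_proba(model, row) >= 0.5 for row in X_test]
accuracy = sum(int(p) == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Because the data are synthetic and well separated by construction, the accuracy printed here says nothing about mvPPT's real performance; the sketch only shows how per-variant features from several categories can feed one boosted ensemble.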