Pathogenicity Prediction of Single Amino Acid Variants With Machine Learning Model Based on Protein Structural Energies

Cited by: 1
Authors
Wu, Tzu-Hsuan [1]
Lin, Peng-Chan [2]
Chou, Hsin-Hung [3]
Shen, Meng-Ru [4]
Hsieh, Sun-Yuan [1, 5]
Affiliations
[1] Natl Cheng Kung Univ, Inst Med Informat, Tainan 701, Taiwan
[2] Natl Cheng Kung Univ Hosp, Dept Comp Sci & Informat Engn, Dept Internal Med, Tainan 704, Taiwan
[3] Natl Chi Nan Univ, Dept Comp Sci & Informat Engn, Puli Township 54516, Nantou County, Taiwan
[4] Natl Cheng Kung Univ, Dept Obstet & Gynecol, Dept Pharmacol, Coll Med, Tainan 701, Taiwan
[5] Natl Cheng Kung Univ, Inst Mfg Informat Syst, Dept Comp Sci & Informat Engn, Tainan 701, Taiwan
Keywords
Machine learning; pathogenicity prediction; protein structure energy; single amino acid variants; SNP; MUTATIONS; POLYMORPHISMS;
DOI
10.1109/TCBB.2021.3139048
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
The most popular tools for predicting pathogenicity of single amino acid variants (SAVs) were developed based on sequence-based techniques. SAVs may change protein structure and function. In the context of van der Waals force and disulfide bridge calculations, no method directly predicts the impact of mutations on the energies of the protein structure. Here, we combined machine learning methods and energy scores of protein structures calculated by Rosetta Energy Function 2015 to predict SAV pathogenicity. The accuracy level of our model (0.76) is higher than that of six prediction tools. Further analyses revealed that the differential reference energies, attractive energies, and solvation of polar atoms between wildtype and mutant side-chains played essential roles in distinguishing benign from pathogenic variants. These features indicated the physicochemical properties of amino acids, which were observed in 3D structures instead of sequences. We added 16 features to Rhapsody (the prediction tool we used for our data set) and consequently improved its performance. The results indicated that these energy scores were more appropriate and more detailed representations of the pathogenicity of SAVs.
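The "differential" energies mentioned in the abstract can be pictured as per-term differences between a mutant and a wildtype score. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the term labels (`fa_atr`, `fa_sol`, `lk_ball_wtd`, `dslf_fa13`, `ref`) are score-term names from Rosetta Energy Function 2015, the function name `delta_energy_features` is hypothetical, and every number is made up for illustration. The abstract does not say which 16 features were used or how they were aggregated.

```python
# Illustrative sketch only: values below are invented, not taken from the paper.
# Term labels follow REF2015 score-term names; the selection is an assumption.
ENERGY_TERMS = ["fa_atr", "fa_sol", "lk_ball_wtd", "dslf_fa13", "ref"]


def delta_energy_features(wildtype, mutant, terms=ENERGY_TERMS):
    """Return mutant-minus-wildtype differences for each energy term."""
    return {f"d_{term}": mutant[term] - wildtype[term] for term in terms}


# Hypothetical energy scores for one residue position before and after mutation.
wt_scores = {"fa_atr": -4.0, "fa_sol": 2.5, "lk_ball_wtd": -0.5,
             "dslf_fa13": 0.0, "ref": 1.0}
mut_scores = {"fa_atr": -2.5, "fa_sol": 3.0, "lk_ball_wtd": 0.25,
              "dslf_fa13": 0.0, "ref": -0.5}

features = delta_energy_features(wt_scores, mut_scores)
print(features)
```

A feature dictionary like this could then be passed, alongside other descriptors, to any standard classifier; the choice of classifier is left open here because the abstract does not name one.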
Pages: 606-615
Page count: 10
Related papers
50 in total
  • [1] LYRUS: a machine learning model for predicting the pathogenicity of missense variants
    Lai, Jiaying
    Yang, Jordan
    Gamsiz Uzun, Ece D.
    Rubenstein, Brenda M.
    Sarkar, Indra Neil
    BIOINFORMATICS ADVANCES, 2022, 2 (01)
  • [2] DTreePred: an online viewer based on machine learning for pathogenicity prediction of genomic variants
    Gomes, Daniel Henrique Ferreira
    Medeiros, Inacio Gomes
    Petta, Tirzah Braz
    Stransky, Beatriz
    de Souza, Jorge Estefano Santana
    BMC BIOINFORMATICS, 2025, 26 (01): 101
  • [3] SVFX: a machine learning framework to quantify the pathogenicity of structural variants
    Kumar, Sushant
    Harmanci, Arif
    Vytheeswaran, Jagath
    Gerstein, Mark B.
    GENOME BIOLOGY, 2020, 21 (01): 274
  • [4] Accurate prediction of functional effect of single amino acid variants with deep learning
    Derbel, Houssemeddine
    Zhao, Zhongming
    Liu, Qian
    COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2023, 21: 5776-5784
  • [5] PPVED: A machine learning tool for predicting the effect of single amino acid substitution on protein function in plants
    Gou, Xiangjian
    Feng, Xuanjun
    Shi, Haoran
    Guo, Tingting
    Xie, Rongqian
    Liu, Yaxi
    Wang, Qi
    Li, Hongxiang
    Yang, Banglie
    Chen, Lixue
    Lu, Yanli
    PLANT BIOTECHNOLOGY JOURNAL, 2022, 20 (07): 1417-1431
  • [6] InMeRF: prediction of pathogenicity of missense variants by individual modeling for each amino acid substitution
    Takeda, Jun-ichi
    Nanatsue, Kentaro
    Yamagishi, Ryosuke
    Ito, Mikako
    Haga, Nobuhiko
    Hirata, Hiromi
    Ogi, Tomoo
    Ohno, Kinji
    NAR GENOMICS AND BIOINFORMATICS, 2020, 2 (02)
  • [7] Advancing Prediction of Pathogenicity of Familial Hypercholesterolemia LDL Receptor Commonest Variants With Machine Learning Models
    Santos, Raul D.
    JACC-BASIC TO TRANSLATIONAL SCIENCE, 2021, 6 (11): 828-830
  • [8] Prediction of Thermostability of Enzymes Based on the Amino Acid Index (AAindex) Database and Machine Learning
    Li, Gaolin
    Jia, Lili
    Wang, Kang
    Sun, Tingting
    Huang, Jun
    MOLECULES, 2023, 28 (24)
  • [9] Prediction of protein-peptide-binding amino acid residues regions using machine learning algorithms
    Shafiee, Shima
    Fathi, Abdolhossein
    2021 26TH INTERNATIONAL COMPUTER CONFERENCE, COMPUTER SOCIETY OF IRAN (CSICC), 2021
  • [10] PMTPred: machine-learning-based prediction of protein methyltransferases using the composition of k-spaced amino acid pairs
    Yadav, Arvind Kumar
    Gupta, Pradeep Kumar
    Singh, Tiratha Raj
    MOLECULAR DIVERSITY, 2024, 28 (04): 2301-2315