Assessing computational tools for predicting protein stability changes upon missense mutations using a new dataset

Cited by: 6
Authors
Zheng, Feifan [1]
Liu, Yang [1]
Yang, Yan [1]
Wen, Yuhao [1]
Li, Minghui [1, 2]
Affiliations
[1] Soochow Univ, Suzhou Med Coll, Sch Biol & Basic Med Sci, MOE Key Lab Geriatr Dis & Immunol, Suzhou, Peoples R China
[2] Soochow Univ, Sch Biol & Basic Med Sci, MOE Key Lab Geriatr Dis & Immunol, Suzhou Med Coll, Suzhou 215123, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
computational tools; missense mutations; protein stability changes; stabilizing mutations; CHALLENGES; SEQUENCE; IMPACT
DOI
10.1002/pro.4861
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Insight into how mutations affect protein stability is crucial for protein engineering, understanding genetic diseases, and exploring protein evolution. Numerous computational methods have been developed to predict the impact of amino acid substitutions on protein stability. Nevertheless, comparing these methods poses challenges due to variations in their training data. Moreover, it is observed that they tend to perform better at predicting destabilizing mutations than stabilizing ones. Here, we meticulously compiled a new dataset from three recently published databases: ThermoMutDB, FireProtDB, and ProThermDB. This dataset, which does not overlap with the well-established S2648 dataset, consists of 4038 single-point mutations, including over 1000 stabilizing mutations. We assessed these mutations using 27 computational methods, including the latest ones utilizing mega-scale stability datasets and transfer learning. We excluded entries with overlap or similarity to training datasets to ensure fairness. Pearson correlation coefficients for the tested tools ranged from 0.20 to 0.53 on unseen data, and none of the methods could accurately predict stabilizing mutations, even those performing well in anti-symmetric property analysis. While most methods present consistent trends for predicting destabilizing mutations across various properties such as solvent exposure and secondary conformation, stabilizing mutations do not exhibit a clear pattern. Our study also suggests that solely addressing training dataset bias may not significantly enhance accuracy of predicting stabilizing mutations. These findings emphasize the importance of developing precise predictive methods for stabilizing mutations.
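The two evaluation measures named in the abstract, the Pearson correlation coefficient and the anti-symmetric property, can be illustrated with a minimal sketch. This is not the authors' evaluation code. The function names, the toy values, and the bias definition (mean of forward plus reverse predictions, divided by two) are assumptions made for illustration. The underlying idea is that a forward mutation and its reverse should have stability changes of equal magnitude and opposite sign, so a perfectly anti-symmetric predictor yields a forward/reverse correlation of -1 and a bias of 0.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def antisymmetry(forward, reverse):
    """Return (r_fr, bias) for predicted stability changes.

    forward[i] is the prediction for a mutation, reverse[i] the prediction
    for the corresponding reverse mutation. The bias definition used here
    is an assumption for illustration.
    """
    r_fr = pearson(forward, reverse)
    bias = sum(f + r for f, r in zip(forward, reverse)) / (2 * len(forward))
    return r_fr, bias


# Toy values invented for illustration only (kcal/mol); not data from the paper.
experimental = [1.2, 0.4, 2.1, -0.8, -1.5, 0.9, -0.3]
predicted = [0.9, 0.6, 1.5, 0.3, -0.2, 1.1, 0.4]
forward_pred = [0.9, 0.6, 1.5, 0.3]
reverse_pred = [-0.7, -0.1, -1.0, 0.2]

print("Pearson r (experimental vs. predicted):", pearson(experimental, predicted))
print("Anti-symmetry (r_fr, bias):", antisymmetry(forward_pred, reverse_pred))
```

Computing the correlation separately on the stabilizing and destabilizing subsets of a dataset follows the same pattern: filter the paired values by the sign of the experimental stability change, then call `pearson` on each subset.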
Pages: 17