Challenges in predicting stabilizing variations: An exploration

Cited by: 13
Authors
Benevenuta, Silvia [1]
Birolo, Giovanni [1]
Sanavia, Tiziana [1]
Capriotti, Emidio [2]
Fariselli, Piero [1]
Affiliations
[1] Univ Torino, Dept Med Sci, Turin, Italy
[2] Univ Bologna, Dept Pharm & Biotechnol FaBiT, Bologna, Italy
Keywords
protein stability; single-point mutation; stability predictors; machine learning; stabilizing variants; PROTEIN STABILITY; WEB SERVER; MUTATIONS; SEQUENCE;
DOI
10.3389/fmolb.2022.1075570
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
An open challenge of computational and experimental biology is understanding the impact of non-synonymous DNA variations on protein function and, subsequently, human health. The effects of these variants on protein stability can be measured as the difference in the free energy of unfolding (ΔΔG) between the mutated structure of the protein and its wild-type form. Over the years, bioinformaticians have developed a wide variety of tools and approaches to predict the ΔΔG. Although the performance of these tools is highly variable, overall they are less accurate at predicting stabilizing variations than destabilizing ones. Here, we analyze the possible reasons for this difference by focusing on the relationship between experimentally measured ΔΔG and seven protein properties on three widely used datasets (S2648, VariBench, Ssym) and a recently introduced one (S669). These properties include protein structural information, different physical properties and statistical potentials. We found that two highly used input features, i.e., hydrophobicity and the Blosum62 substitution matrix, show a performance close to random choice when trying to separate stabilizing variants from either neutral or destabilizing ones. We then speculate that, since destabilizing variations are the most abundant class in the available datasets, the overall performance of the methods is higher when including features that improve the prediction for the destabilizing variants at the expense of the stabilizing ones. These findings highlight the need to design predictive methods that can also exploit input features highly correlated with the stabilizing variants. New tools should also be tested on a dataset that is not artificially balanced, reporting the performance on all three classes (i.e., stabilizing, neutral and destabilizing variants) and not only the overall results.
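The abstract's closing recommendation, reporting performance per class rather than only overall, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the sign convention (positive ΔΔG means stabilizing), the ±0.5 kcal/mol neutral band, the helper names, and the toy values, which are invented and skewed towards destabilizing variants.

```python
from collections import Counter

# Assumption (not from the abstract): ddG in kcal/mol, positive = stabilizing,
# and |ddG| <= 0.5 is treated as neutral.
NEUTRAL_BAND = 0.5


def to_class(ddg, band=NEUTRAL_BAND):
    """Map a ddG value to one of three classes under the assumed convention."""
    if ddg > band:
        return "stabilizing"
    if ddg < -band:
        return "destabilizing"
    return "neutral"


def overall_accuracy(measured, predicted, band=NEUTRAL_BAND):
    """Fraction of variants whose predicted class matches the measured class."""
    correct = sum(
        to_class(m, band) == to_class(p, band) for m, p in zip(measured, predicted)
    )
    return correct / len(measured)


def per_class_recall(measured, predicted, band=NEUTRAL_BAND):
    """Recall computed separately for each measured class."""
    totals, hits = Counter(), Counter()
    for m, p in zip(measured, predicted):
        true_cls = to_class(m, band)
        totals[true_cls] += 1
        if to_class(p, band) == true_cls:
            hits[true_cls] += 1
    return {cls: hits[cls] / totals[cls] for cls in totals}


# Invented toy data: 6 destabilizing, 2 neutral, 2 stabilizing variants.
measured = [-2.1, -1.4, -0.9, -3.0, -1.2, -0.7, 0.1, -0.2, 1.3, 0.8]
predicted = [-1.8, -1.0, -1.1, -2.2, -0.9, -0.6, -0.1, -0.8, -0.4, 0.2]

print("overall accuracy:", overall_accuracy(measured, predicted))
print("per-class recall:", per_class_recall(measured, predicted))
```

On this toy data the overall accuracy looks acceptable while the recall on the stabilizing class is zero, which is the kind of gap that a single aggregate score can hide on a dataset dominated by destabilizing variants.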
Pages: 10