PANDA: Predicting the change in proteins binding affinity upon mutations by finding a signal in primary structures

Cited by: 3
Authors
Abbasi, Wajid Arshad [1]
Abbas, Syed Ali [1]
Andleeb, Saiqa [2]
Affiliations
[1] Univ Azad Jammu & Kashmir, Dept Comp Sci & Informat Technol, Computat Biol & Data Anal Lab, King Abdullah Campus, Muzaffarabad 13100, Aj&k, Pakistan
[2] Univ Azad Jammu & Kashmir, Dept Zool, Biotechnol Lab, King Abdullah Campus, Muzaffarabad 13100, Aj&k, Pakistan
Keywords
Protein-protein interaction; protein sequence analysis; machine learning; web services; mutational analysis; binding affinity; FREE-ENERGY; WEB-SERVER; STABILITY; COMPLEXES; PROTEOME;
DOI
10.1142/S0219720021500153
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Accurately determining the change in protein binding affinity upon mutation is important for finding novel therapeutics and for assisting mutagenesis studies. Measuring this change requires sophisticated, expensive, and time-consuming wet-lab experiments, which computational methods can support. Most available computational prediction techniques depend on protein structures, which limits their applicability to protein complexes with known 3D structures. In this work, we explore sequence-based prediction of the change in protein binding affinity upon mutation and question the effectiveness of K-fold cross-validation (CV) across mutations, adopted in previous studies, for assessing how well such predictors generalize to cases with no known mutation during training. We use protein sequence information instead of protein structures, along with machine learning techniques, to predict the change in protein binding affinity upon mutation. Our proposed sequence-based predictor, called PANDA, performs comparably to existing methods when gauged through an appropriate CV scheme and an external independent test dataset. On the external test dataset, PANDA gives a maximum Pearson correlation coefficient of 0.52, compared with 0.59 for MutaBind, the state-of-the-art protein structure-based method. Our sequence-based method therefore has wide applicability and performance comparable to that of existing protein structure-based methods. PANDA is accessible through a cloud-based webserver at https://sites.google.com/view/wajidarshad/software, and its Python code is available at https://github.com/wajidarshad/panda.
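The contrast the abstract draws between K-fold CV across mutations and a stricter evaluation can be illustrated with a minimal sketch. It assumes that the stricter scheme holds out all mutations of one complex at a time; the abstract does not specify the scheme, so this grouping is an assumption and not necessarily the one used for PANDA. The function names, complex identifiers, and values below are hypothetical toy data, not results from the paper.

```python
import math
from collections import defaultdict


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def leave_one_complex_out(complex_ids):
    """Yield (held_out_complex, train_idx, test_idx), one fold per complex.

    Every mutation of the held-out complex goes to the test fold, so the
    training fold never contains a mutation from that complex.
    """
    by_complex = defaultdict(list)
    for i, cid in enumerate(complex_ids):
        by_complex[cid].append(i)
    for cid, test_idx in by_complex.items():
        train_idx = [i for i, c in enumerate(complex_ids) if c != cid]
        yield cid, train_idx, test_idx


# Hypothetical toy data: one entry per mutation, labelled by its complex.
complex_ids = ["A", "A", "B", "B", "C", "C"]
folds = list(leave_one_complex_out(complex_ids))

# Hypothetical measured vs. predicted affinity changes for scoring.
measured = [1.0, 2.0, 3.0]
predicted = [2.0, 4.0, 6.0]
score = pearson(measured, predicted)
```

Plain K-fold CV over the same six mutations could place one mutation of complex A in training and the other in testing, which is the kind of leakage the grouped split avoids. The `pearson` helper mirrors the metric reported in the abstract.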
Pages: 19