Machine Learning Approaches for Protein-Protein Interaction Hot Spot Prediction: Progress and Comparative Assessment

Cited by: 54
Authors
Liu, Siyu [1]
Liu, Chuyao [1]
Deng, Lei [1]
Affiliations
[1] Central South University, School of Software, Changsha 410075, Hunan, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
hot spots; protein-protein interaction; machine learning; performance evaluation; AMINO-ACID; SOLVENT ACCESSIBILITY; BINDING-ENERGY; DATABASE; RESIDUES; SELECTION; CONSERVATION; INFORMATION; SEQUENCE; SERVER;
DOI
10.3390/molecules23102535
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Hot spots are the subset of interface residues that account for most of the binding free energy, and they play essential roles in the stability of protein binding. Identifying which interface residues of a protein-protein complex form hot spots is critical for understanding the principles of protein interactions and has broad applications in protein design and drug development. Experimental methods such as alanine scanning mutagenesis are labor-intensive and time-consuming. At present, only a limited number of hot spots have been measured experimentally. Computational approaches to predicting hot spots are therefore becoming increasingly important. Here, we describe the basic concepts and recent advances in machine learning applications for inferring protein-protein interaction hot spots, and we assess the performance of widely used features, machine learning algorithms, and existing state-of-the-art approaches. We also discuss the challenges and future directions in hot spot prediction.
Pages: 15