RGN: Residue-Based Graph Attention and Convolutional Network for Protein-Protein Interaction Site Prediction

Cited by: 26
Authors
Wang, Shuang [1]
Chen, Wenqi [1]
Han, Peifu [1]
Li, Xue [1]
Song, Tao [1, 2]
Affiliations
[1] China Univ Petr, Coll Comp Sci & Technol, Qingdao 266580, Peoples R China
[2] Univ Politecn Madrid, Fac Comp Sci, Dept Artificial Intelligence, Madrid 28031, Spain
Keywords
SECONDARY STRUCTURE; FINGERPRINTS
DOI
10.1021/acs.jcim.2c01092
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Predicting protein-protein interaction (PPI) sites plays a very important role in biochemical processes, and many computational methods have been proposed. However, most existing methods are time-consuming and lack accuracy, so an effective computational method is needed. In this article, we present a novel computational model called RGN (residue-based graph attention and convolutional network) to predict PPI sites. The protein is treated as a graph in which each amino acid is a node. The position-specific scoring matrix, hidden Markov model, hydrogen bond estimation algorithm, and ProtBert are used as node features, and the edges are determined by the spatial distance between amino acids. We then use a residue-based graph convolutional network and a graph attention network to extract deeper features. Finally, the processed node features are fed into the prediction layer. We show the superiority of our model by comparing it with four protein structure-based methods and five protein sequence-based methods. Our model obtains the best performance on all evaluation metrics (accuracy, precision, recall, F1 score, Matthews correlation coefficient, area under the receiver operating characteristic curve, and area under the precision-recall curve). We also conduct a case study, which demonstrates that extracting protein information from a structural perspective is effective and points out the difficult aspects of PPI site prediction.
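The graph construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `residue_adjacency` and `propagate`, the use of one representative 3D coordinate per residue, and the 10 Å default cutoff are all assumptions made for illustration, since the abstract does not state them. The propagation step shown is a plain, parameter-free graph-convolution average; the learned weights and the attention mechanism of RGN are omitted.

```python
import math

def residue_adjacency(coords, cutoff=10.0):
    """Link residues i and j when their representative atoms lie within
    `cutoff` angstroms. The cutoff value is an assumption, not from the paper."""
    n = len(coords)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i][j] = adj[j][i] = 1.0
    return adj

def propagate(adj, feats):
    """One symmetric-normalised graph-convolution step with self-loops:
    each residue's features become a weighted average over itself and its
    spatial neighbours. No learned weights (illustrative only)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    dim = len(feats[0])
    out = []
    for i in range(n):
        row = [0.0] * dim
        for j in range(n):
            if a_hat[i][j]:
                w = a_hat[i][j] / math.sqrt(deg[i] * deg[j])
                for k in range(dim):
                    row[k] += w * feats[j][k]
        out.append(row)
    return out

# Toy example: four residues on a line; the last one is far from the rest.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
adj = residue_adjacency(coords, cutoff=8.0)
feats = [[1.0], [2.0], [3.0], [4.0]]  # stand-in for PSSM/HMM/ProtBert features
smoothed = propagate(adj, feats)
```

In this toy example, residues 0-1 and 1-2 are connected, while residue 3 has no neighbours and so keeps its original feature value after propagation.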
Pages: 5961-5974
Page count: 14