Predicting protein-peptide binding residues via interpretable deep learning

Citations: 36
Authors
Wang, Ruheng [1, 2]
Jin, Junru [1, 2]
Zou, Quan [3]
Nakai, Kenta [4]
Wei, Leyi [1, 2]
Affiliations
[1] Shandong Univ, Sch Software, Jinan 250101, Peoples R China
[2] Shandong Univ, Joint SDU NTU Ctr Artificial Intelligence Res C F, Jinan 250101, Peoples R China
[3] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu 610054, Peoples R China
[4] Univ Tokyo, Inst Med Sci, Human Genome Ctr, Tokyo 1088639, Japan
Funding
National Natural Science Foundation of China;
Keywords
SEQUENCE-BASED PREDICTION; SITES; DNA;
DOI
10.1093/bioinformatics/btac352
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Summary: Identifying protein-peptide binding residues is fundamentally important for understanding the mechanisms of protein function and for drug discovery. Although several computational methods have been developed, most of them rely heavily on third-party tools or complex data preprocessing for feature design, which easily leads to low computational efficiency and poor predictive performance. To address these limitations, we propose PepBCL, a novel BERT (Bidirectional Encoder Representations from Transformers)-based contrastive learning framework that predicts protein-peptide binding residues from protein sequences alone. PepBCL is an end-to-end predictive model that is independent of feature engineering. Specifically, we introduce a well pre-trained protein language model that automatically extracts and learns high-level latent representations of protein sequences relevant to protein structure and function. Further, we design a novel contrastive learning module to optimize the feature representations of binding residues in the imbalanced dataset. We demonstrate that the proposed method significantly outperforms state-of-the-art methods in benchmark comparisons and achieves more robust performance. Moreover, we find that integrating traditional features with our learnt features further improves performance. Interestingly, interpretability analysis of our model highlights the flexibility and adaptability of the deep learning-based protein language model in capturing both conserved and non-conserved sequence characteristics of peptide-binding residues. Finally, to facilitate the use of our method, we establish an online predictive platform implementing PepBCL, which is available at http://server.wei-group.net/PepBCL/.
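The abstract does not give the form of the contrastive learning module, so the following is only an illustrative sketch of the general idea, not the PepBCL implementation. It assumes a standard margin-based pairwise contrastive loss over per-residue embeddings: residues with the same label (binding or non-binding) are pulled together, and residues with different labels are pushed at least `margin` apart. The function names, the `margin` parameter, and the toy embeddings are all hypothetical.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_contrastive_loss(embeddings, labels, margin=1.0):
    """Mean margin-based contrastive loss over all residue pairs.

    Assumption (not from the paper): same-label pairs contribute d^2,
    different-label pairs contribute max(0, margin - d)^2.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(embeddings)), 2):
        d = euclidean(embeddings[i], embeddings[j])
        if labels[i] == labels[j]:
            total += d ** 2                      # pull same-class residues together
        else:
            total += max(0.0, margin - d) ** 2   # push different classes apart
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Toy per-residue embeddings; label 1 = binding, 0 = non-binding.
toy_embeddings = [[0.0, 0.0], [0.0, 0.1], [1.0, 1.0], [0.1, 0.0]]
toy_labels = [1, 1, 0, 0]
toy_loss = pairwise_contrastive_loss(toy_embeddings, toy_labels)
```

In a real model the embeddings would come from the pre-trained protein language model, and a loss of this kind would typically be combined with a classification loss. How PepBCL does this is not stated in the abstract.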
Pages: 3351-3360
Page count: 10
Related papers
50 in total
  • [21] Deep-learning-based prediction framework for protein-peptide interactions with structure generation pipeline
    Ge, Jingxuan
    Jiang, Dejun
    Sun, Huiyong
    Kang, Yu
    Pan, Peichen
    Deng, Yafeng
    Hsieh, Chang-Yu
    Hou, Tingjun
    CELL REPORTS PHYSICAL SCIENCE, 2024, 5 (06):
  • [22] A FAST AND ACCURATE EMPIRICAL EXPRESSION FOR PREDICTING PROTEIN-PROTEIN AND PROTEIN-PEPTIDE INTERACTIONS
    Audie, J.
    Audie, D.
    Boyd, C.
    BIOPOLYMERS, 2009, 92 (04) : 358 - 358
  • [23] Study of Data Imbalanced Problem in Protein-peptide Binding Prediction
    Gao, Lu
    Siu, Shirley W. I.
    PROCEEDINGS OF 2020 12TH INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL TECHNOLOGY, ICBBT 2020, 2020, : 61 - 66
  • [24] PepCNN deep learning tool for predicting peptide binding residues in proteins using sequence, structural, and language model features
    Chandra, Abel
    Sharma, Alok
    Dehzangi, Iman
    Tsunoda, Tatsuhiko
    Sattar, Abdul
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [26] A deep attention model for wide-genome protein-peptide binding affinity prediction at a sequence level
    Sun, Xiaohan
    Wu, Zhixiang
    Su, Jingjie
    Li, Chunhua
    INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES, 2024, 276
  • [27] Efficient molecular dynamics simulations of protein-peptide binding kinetics
    Zwier, Matthew C.
    Kaus, Joseph W.
    Bhatt, Divesh
    Adelman, Joshua L.
    Grabe, Michael
    Zuckerman, Daniel M.
    Chong, Lillian T.
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2012, 244
  • [28] Predicting protein-ligand binding residues with deep convolutional neural networks
    Cui, Yifeng
    Dong, Qiwen
    Hong, Daocheng
    Wang, Xikun
    BMC BIOINFORMATICS, 2019, 20 (1)
  • [30] Predicting protein phosphorylation sites in soybean using interpretable deep tabular learning network
    Khalili, Elham
    Ramazi, Shahin
    Ghanati, Faezeh
    Kouchaki, Samaneh
    BRIEFINGS IN BIOINFORMATICS, 2022, 23 (02)