Predicting protein-peptide binding residues via interpretable deep learning

Cited by: 36
Authors
Wang, Ruheng [1, 2]
Jin, Junru [1, 2]
Zou, Quan [3]
Nakai, Kenta [4]
Wei, Leyi [1, 2]
Affiliations
[1] Shandong Univ, Sch Software, Jinan 250101, Peoples R China
[2] Shandong Univ, Joint SDU NTU Ctr Artificial Intelligence Res C F, Jinan 250101, Peoples R China
[3] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu 610054, Peoples R China
[4] Univ Tokyo, Inst Med Sci, Human Genome Ctr, Tokyo 1088639, Japan
Funding
National Natural Science Foundation of China
Keywords
SEQUENCE-BASED PREDICTION; SITES; DNA;
DOI
10.1093/bioinformatics/btac352
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Summary: Identifying protein-peptide binding residues is fundamentally important for understanding the mechanisms of protein function and for drug discovery. Although several computational methods have been developed, most of them rely heavily on third-party tools or complex data preprocessing for feature design, which easily leads to low computational efficiency and poor predictive performance. To address these limitations, we propose PepBCL, a novel BERT (Bidirectional Encoder Representations from Transformers)-based contrastive learning framework that predicts protein-peptide binding residues from protein sequences alone. PepBCL is an end-to-end predictive model that is independent of feature engineering. Specifically, we introduce a well pre-trained protein language model that can automatically extract and learn high-latent representations of protein sequences relevant to protein structure and function. Further, we design a novel contrastive learning module to optimize the feature representations of binding residues in the imbalanced dataset. We demonstrate that our proposed method significantly outperforms state-of-the-art methods under benchmarking comparison and achieves more robust performance. Moreover, we found that integrating traditional features with our learnt features further improves performance. Interestingly, the interpretable analysis of our model highlights the flexibility and adaptability of the deep learning-based protein language model in capturing both conserved and non-conserved sequential characteristics of peptide-binding residues. Finally, to facilitate the use of our method, we establish an online predictive platform as the implementation of the proposed PepBCL, which is available at http://server.wei-group.net/PepBCL/.
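To make the idea of a contrastive learning module concrete, the following is a minimal sketch of a generic margin-based pairwise contrastive loss over per-residue embeddings. It is not the PepBCL implementation: the function names, the margin value, and the toy embeddings are all assumptions for illustration, and the record does not specify the exact loss used in the paper.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_contrastive_loss(embeddings, labels, margin=1.0):
    """Mean margin-based contrastive loss over all residue pairs.

    Same-label pairs are penalised by their squared distance;
    different-label pairs are penalised only when closer than `margin`.
    This is a generic formulation (an assumption), not the paper's loss.
    """
    total, n_pairs = 0.0, 0
    for i, j in combinations(range(len(embeddings)), 2):
        d = euclidean(embeddings[i], embeddings[j])
        if labels[i] == labels[j]:
            total += d ** 2
        else:
            total += max(0.0, margin - d) ** 2
        n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


# Hypothetical 2-D residue embeddings; 1 = binding, 0 = non-binding.
embeddings = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]
labels = [1, 1, 0]
loss = pairwise_contrastive_loss(embeddings, labels, margin=1.0)
```

In this toy case the two binding residues coincide and the non-binding residue lies farther away than the margin, so no pair is penalised. In practice the embeddings would come from the protein language model, and a loss of this kind would typically be combined with a classification loss.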
Pages: 3351-3360
Number of pages: 10