Evolutionary couplings and sequence variation effect predict protein binding sites

Cited by: 15
Authors
Schelling, Maria [1]
Hopf, Thomas A. [1, 2, 3]
Rost, Burkhard [1, 4, 5, 6, 7]
Affiliations
[1] TUM, Dept Informat Bioinformat & Computat Biol i12, Boltzmannstr 3, D-85748 Garching, Germany
[2] Harvard Med Sch, Dept Syst Biol, Boston, MA USA
[3] Harvard Med Sch, Dept Cell Biol, Boston, MA USA
[4] TUM, IAS, Garching, Germany
[5] TUM, Sch Life Sci Weihenstephan WZW, Freising Weihenstephan, Germany
[6] Columbia Univ, Dept Biochem & Mol Biophys, New York, NY USA
[7] Columbia Univ, New York Consortium Membrane Prot Struct NYCOMP, New York, NY USA
Keywords
binding site; coevolution; evolutionary couplings; machine learning; neural network; prediction; sequence variation
DOI
10.1002/prot.25585
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Binding small ligands such as ions, or macromolecules such as DNA, RNA, and other proteins, is an important aspect of the molecular function of proteins. Many binding sites remain without experimental annotations. Predicting binding sites on a per-residue level is challenging, but if 3D structures are known, information about coevolving residue pairs (evolutionary couplings) can predict catalytic residues through mutual information. Here, we predicted protein binding sites from evolutionary couplings derived from a global statistical model using maximum entropy. Additionally, we included information from sequence variation. A simple method using a weighted sum over eight scores substantially outperformed random prediction (F1 = 19.3% ± 0.7% vs F1 = 2% for random). Training a neural network on these eight scores (along with predicted solvent accessibility and conservation in protein families) improved performance substantially (F1 = 26.2% ± 0.8%). Although the machine learning was limited by the small data set and possibly wrong annotations of binding sites, the predicted binding sites formed spatial clusters in the protein. The source code of the binding site predictions is available through GitHub.
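As a rough illustration of the two quantities named in the abstract, the sketch below combines eight per-residue scores into a weighted sum, thresholds the result into binding / non-binding calls, and computes F1 against reference annotations. It is not the authors' implementation: the function names, the equal weights, the threshold of 0.5, and the toy data are all assumptions chosen only to make the example self-contained.

```python
from typing import List, Sequence


def weighted_sum_scores(features: Sequence[Sequence[float]],
                        weights: Sequence[float]) -> List[float]:
    """Combine the per-residue feature scores into one score per residue."""
    if any(len(row) != len(weights) for row in features):
        raise ValueError("each residue needs exactly one value per weight")
    return [sum(w * x for w, x in zip(weights, row)) for row in features]


def predict_binding(scores: Sequence[float], threshold: float) -> List[bool]:
    """Call a residue 'binding' when its combined score reaches the threshold."""
    return [s >= threshold for s in scores]


def f1_score(predicted: Sequence[bool], observed: Sequence[bool]) -> float:
    """Harmonic mean of precision and recall for the positive (binding) class."""
    tp = sum(p and o for p, o in zip(predicted, observed))
    fp = sum(p and not o for p, o in zip(predicted, observed))
    fn = sum(not p and o for p, o in zip(predicted, observed))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 5 residues, 8 scores each (values are made up).
features = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1],
]
weights = [0.125] * 8          # assumed equal weights, not fitted values
observed = [True, False, False, True, True]  # assumed reference annotations

scores = weighted_sum_scores(features, weights)
predicted = predict_binding(scores, threshold=0.5)
f1 = f1_score(predicted, observed)
print(scores, predicted, round(f1, 3))
```

In practice the weights and the threshold would be tuned on training data; equal weights are used here only to keep the arithmetic easy to follow.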
Pages: 1064-1074
Number of pages: 11
Related papers
50 in total
  • [41] Predicted protein-protein interaction sites from local sequence information
    Ofran, Y
    Rost, B
    FEBS LETTERS, 2003, 544 (1-3) : 236 - 239
  • [42] Prediction of RNA binding sites in proteins from amino acid sequence
    Terribilini, Michael
    Lee, Jae-Hyung
    Yan, Changhui
    Jernigan, Robert L.
    Honavar, Vasant
    Dobbs, Drena
    RNA, 2006, 12 (08) : 1450 - 1462
  • [43] Binding-sites Prediction Assisting Protein-protein Docking
    Konc, Janez
    Trykowska Konc, Joanna
    Penca, Matej
    Janezic, Dusanka
    ACTA CHIMICA SLOVENICA, 2011, 58 (03) : 396 - 401
  • [44] Prediction of zinc binding sites in proteins using sequence derived information
    Srivastava, Abhishikha
    Kumar, Manish
    JOURNAL OF BIOMOLECULAR STRUCTURE & DYNAMICS, 2018, 36 (16) : 4413 - 4423
  • [45] Protein embeddings predict binding residues in disordered regions
    Jahn, Laura R.
    Marquet, Celine
    Heinzinger, Michael
    Rost, Burkhard
    SCIENTIFIC REPORTS, 2024, 14 (01)
  • [46] Prediction of protein mononucleotide binding sites using AlphaFold2 and machine learning
    Yamaguchi, Shohei
    Nakashima, Haruka
    Moriwaki, Yoshitaka
    Terada, Tohru
    Shimizu, Kentaro
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2022, 100
  • [47] A Review About RNA-Protein-Binding Sites Prediction Based on Deep Learning
    Yan, Jianrong
    Zhu, Min
    IEEE ACCESS, 2020, 8 : 150929 - 150944
  • [48] SeBPPI: A Sequence-Based Protein-Protein Binding Predictor
    Wang, Bo
    Mao, Jun
    Wei, Min
    Qi, Yifei
    Zhang, John Z. H.
    JOURNAL OF COMPUTATIONAL BIOPHYSICS AND CHEMISTRY, 2022, 21 (06) : 729 - 737
  • [49] Locating ligand binding sites in G-protein coupled receptors using combined information from docking and sequence conservation
    Vidad, Ashley Ryan
    Macaspac, Stephen
    Ng, Ho Leung
    PEERJ, 2021, 9
  • [50] RBProkCNN: Deep learning on appropriate contextual evolutionary information for RNA binding protein discovery in prokaryotes
    Pradhan, Upendra Kumar
    Naha, Sanchita
    Das, Ritwika
    Gupta, Ajit
    Parsad, Rajender
    Meher, Prabina Kumar
    COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2024, 23 : 1631 - 1640