Evolutionary couplings and sequence variation effect predict protein binding sites

Cited by: 15
Authors
Schelling, Maria [1 ]
Hopf, Thomas A. [1 ,2 ,3 ]
Rost, Burkhard [1 ,4 ,5 ,6 ,7 ]
Affiliations
[1] TUM, Dept Informat Bioinformat & Computat Biol i12, Boltzmannstr 3, D-85748 Garching, Germany
[2] Harvard Med Sch, Dept Syst Biol, Boston, MA USA
[3] Harvard Med Sch, Dept Cell Biol, Boston, MA USA
[4] TUM, IAS, Garching, Germany
[5] TUM, Sch Life Sci Weihenstephan WZW, Freising Weihenstephan, Germany
[6] Columbia Univ, Dept Biochem & Mol Biophys, New York, NY USA
[7] Columbia Univ, New York Consortium Membrane Prot Struct NYCOMP, New York, NY USA
Keywords
binding site; coevolution; evolutionary couplings; machine learning; neural network; prediction; sequence variation;
DOI
10.1002/prot.25585
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
Binding small ligands such as ions or macromolecules such as DNA, RNA, and other proteins is one important aspect of the molecular function of proteins. Many binding sites remain without experimental annotations. Predicting binding sites on a per-residue level is challenging, but if 3D structures are known, information about coevolving residue pairs (evolutionary couplings) can predict catalytic residues through mutual information. Here, we predicted protein binding sites from evolutionary couplings derived from a global statistical model using maximum entropy. Additionally, we included information from sequence variation. A simple method using a weighted sum over eight scores substantially outperformed random (F1 = 19.3% +/- 0.7% vs F1 = 2% for random). Training a neural network on these eight scores (along with predicted solvent accessibility and conservation in protein families) improved substantially (F1 = 26.2% +/- 0.8%). Although the machine learning was limited by the small data set and possibly wrong annotations of binding sites, the predicted binding sites formed spatial clusters in the protein. The source code of the binding site predictions is available through GitHub: .
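The scoring scheme the abstract describes (a weighted sum over eight per-residue scores, evaluated by F1) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the uniform weights, the threshold, and the toy data are all assumptions made for illustration, since the abstract does not state how the eight scores are weighted or thresholded.

```python
from typing import List, Sequence


def weighted_sum_scores(
    scores: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine the per-residue scores (one row per residue) into one value each."""
    return [sum(w * s for w, s in zip(weights, row)) for row in scores]


def predict_binding(
    scores: Sequence[Sequence[float]], weights: Sequence[float], threshold: float
) -> List[bool]:
    """Call a residue 'binding' when its weighted sum reaches the threshold."""
    return [total >= threshold for total in weighted_sum_scores(scores, weights)]


def f1_score(predicted: Sequence[bool], observed: Sequence[bool]) -> float:
    """F1 = harmonic mean of precision and recall for the binding class."""
    tp = sum(1 for p, o in zip(predicted, observed) if p and o)
    fp = sum(1 for p, o in zip(predicted, observed) if p and not o)
    fn = sum(1 for p, o in zip(predicted, observed) if not p and o)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: four residues, eight scores each (assumed values).
toy_scores = [[1.0] * 8, [0.0] * 8, [0.5] * 8, [0.25] * 8]
toy_weights = [0.125] * 8          # assumed uniform weights
toy_observed = [True, False, False, True]

toy_predicted = predict_binding(toy_scores, toy_weights, threshold=0.5)
toy_f1 = f1_score(toy_predicted, toy_observed)
print(toy_predicted, toy_f1)
```

In practice the weights and threshold would be fitted on annotated proteins; with binding residues being rare, F1 is a more informative measure than accuracy, which is consistent with the low random baseline reported in the abstract.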
Pages: 1064 - 1074
Page count: 11
Related papers
50 in total
  • [31] Identifying protein binding sites and optimal ligands
    Beuscher, A
    Olson, AJ
    Goodsell, DS
    LETTERS IN DRUG DESIGN & DISCOVERY, 2005, 2 (06) : 483 - 489
  • [32] Alternative evolutionary histories in the sequence space of an ancient protein
    Starr, Tyler N.
    Picton, Lora K.
    Thornton, Joseph W.
    NATURE, 2017, 549 (7672) : 409 - +
  • [33] Phosphate binding sites identification in protein structures
    Parca, Luca
    Gherardini, Pier Federico
    Helmer-Citterich, Manuela
    Ausiello, Gabriele
    NUCLEIC ACIDS RESEARCH, 2011, 39 (04) : 1231 - 1242
  • [34] Predicting Protein-Protein Interaction Sites by Rotation Forests with Evolutionary Information
    Hu, Xinying
    Jing, Anqi
    Du, Xiuquan
    INTELLIGENT COMPUTING IN BIOINFORMATICS, 2014, 8590 : 271 - 279
  • [35] Analysis of variation at transcription factor binding sites in Drosophila and humans
    Spivakov, Mikhail
    Akhtar, Junaid
    Kheradpour, Pouya
    Beal, Kathryn
    Girardot, Charles
    Koscielny, Gautier
    Herrero, Javier
    Kellis, Manolis
    Furlong, Eileen E. M.
    Birney, Ewan
    GENOME BIOLOGY, 2012, 13 (09):
  • [36] Relating the shape of protein binding sites to binding affinity profiles: is there an association?
    Simon, Zoltan
    Vigh-Smeller, Margit
    Peragovics, Agnes
    Csukly, Gabor
    Zahoranszky-Kohalmi, Gergely
    Rauscher, Anna A.
    Jelinek, Balazs
    Hari, Peter
    Bitter, Istvan
    Malnasi-Csizmadia, Andras
    Czobor, Pal
    BMC STRUCTURAL BIOLOGY, 2010, 10
  • [38] Identification of DNA-protein Binding Sites through Multi-Scale Local Average Blocks on Sequence Information
    Shen, Cong
    Ding, Yijie
    Tang, Jijun
    Song, Jian
    Guo, Fei
    MOLECULES, 2017, 22 (12):
  • [39] Application of Machine Learning Techniques to Predict Protein Phosphorylation Sites
    Zhang, Shengli
    Li, Xian
    Fan, Chengcheng
    Wu, Zhehui
    Liu, Qian
    LETTERS IN ORGANIC CHEMISTRY, 2019, 16 (04) : 247 - 257
  • [40] Spatial clustering of protein binding sites for template based protein docking
    Ghoorah, Anisah W.
    Devignes, Marie-Dominique
    Smail-Tabbone, Malika
    Ritchie, David W.
    BIOINFORMATICS, 2011, 27 (20) : 2820 - 2827