Sequence-based prediction of protein interaction sites with an integrative method

Cited by: 117
Authors
Chen, Xue-Wen [1, 2]
Jeong, Jong Cheol [1 ]
Affiliations
[1] Univ Kansas, Informat & Telecommun Technol Ctr, Bioinformat & Computat Life Sci Lab, Lawrence, KS 66045 USA
[2] Univ Kansas, Dept Comp Sci & Elect Engn, Lawrence, KS 66045 USA
Funding
National Science Foundation (USA)
Keywords
MOLECULAR CHAPERONE; SURFACE COMPLEMENTARITY; HYDROPHOBIC MOMENT; SUBSTRATE-BINDING; CRYSTAL-STRUCTURE; SOFT DOCKING; J-DOMAIN; RECOGNITION; CONSERVATION; MUTATIONS;
DOI
10.1093/bioinformatics/btp039
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Identification of protein interaction sites has a significant impact on understanding protein function, elucidating signal transduction networks and drug design studies. With the exponentially growing protein sequence data, predictive methods that use only sequence information for protein interaction site prediction have drawn increasing interest. In this article, we propose a predictive model for identifying protein interaction sites. Without using any structure data, the proposed method extracts a wide range of features from protein sequences. A random forest-based integrative model is developed to effectively utilize these features and to deal with the imbalanced data classification problem commonly encountered in binding site predictions.

Results: We evaluate the predictive method using 2829 interface residues and 24 616 non-interface residues extracted from 99 polypeptide chains in the Protein Data Bank. The experimental results show that the proposed method performs significantly better than two other sequence-based predictive methods and can reliably predict residues involved in protein interaction sites. Furthermore, we apply the method to predict interaction sites and to construct three protein complexes: the DnaK molecular chaperone system, 1YUW and 1DKG, which provide new insight into the sequence-function relationship. We show that the predicted interaction sites can be valuable as a first approach for guiding experimental methods investigating protein-protein interactions and localizing the specific interface residues.
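To make the class imbalance mentioned in the abstract concrete, the following minimal sketch uses only the residue counts reported there. It computes the imbalance ratio, shows the accuracy a trivial "always non-interface" predictor would reach, and builds balanced training subsets by undersampling the majority class. The helper names (`majority_baseline_accuracy`, `balanced_subsamples`) and the undersampling scheme are illustrative assumptions; this record does not describe how the authors' random forest-based model actually balances the classes.

```python
import random

# Counts reported in the abstract.
N_INTERFACE = 2829        # interface residues (positive class)
N_NON_INTERFACE = 24616   # non-interface residues (negative class)


def majority_baseline_accuracy(n_pos: int, n_neg: int) -> float:
    """Accuracy of a trivial predictor that labels every residue non-interface."""
    return n_neg / (n_pos + n_neg)


def balanced_subsamples(pos_ids, neg_ids, n_subsets, seed=0):
    """Yield subsets that pair all positives with an equal-sized random draw of negatives.

    Hypothetical helper: one common way to balance classes before training
    ensemble members, not necessarily the scheme used in the paper.
    """
    rng = random.Random(seed)
    for _ in range(n_subsets):
        sampled_neg = rng.sample(neg_ids, len(pos_ids))  # without replacement
        yield pos_ids + sampled_neg


# Residues are represented by integer ids only; real features are out of scope here.
pos_ids = list(range(N_INTERFACE))
neg_ids = list(range(N_INTERFACE, N_INTERFACE + N_NON_INTERFACE))

ratio = N_NON_INTERFACE / N_INTERFACE
baseline = majority_baseline_accuracy(N_INTERFACE, N_NON_INTERFACE)
subsets = list(balanced_subsamples(pos_ids, neg_ids, n_subsets=5))

print(f"negatives per positive: {ratio:.2f}")
print(f"all-negative baseline accuracy: {baseline:.3f}")
print(f"size of each balanced subset: {len(subsets[0])}")
```

With roughly 8.7 non-interface residues for every interface residue, a predictor that never reports an interface residue is already about 90% accurate, which is why plain accuracy is a weak yardstick for this task and why some form of class balancing is needed.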
Pages: 585-591
Number of pages: 7