Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins

Cited by: 18
Authors
Zhang, Jian [1]
Ghadermarzi, Sina [2]
Kurgan, Lukasz [2]
Affiliations
[1] Xinyang Normal Univ, Sch Comp & Informat Technol, Xinyang 464000, Peoples R China
[2] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Funding
National Science Foundation (USA); National Natural Science Foundation of China
Keywords
MOLECULAR RECOGNITION FEATURES; INTRINSIC DISORDER; INTERACTION SITES; COMPUTATIONAL PREDICTION; MORFS; IDENTIFICATION; REGIONS; RNA; DNA; SERVER;
DOI
10.1093/bioinformatics/btaa573
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: There are over 30 sequence-based predictors of protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy in which the structure- and disorder-specific models may not be able to cross over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors, using a comprehensive benchmark set that includes the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions).

Results: Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER on the structure-annotated proteins and disoRDPbind on the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines the results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over between structure- and disorder-annotated proteins and produces a relatively low number of cross-predictions, offering an accurate alternative for predicting PBRs.
Pages: 4729-4738
Number of pages: 10