Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins

Cited by: 18
Authors
Zhang, Jian [1]
Ghadermarzi, Sina [2]
Kurgan, Lukasz [2]
Affiliations
[1] Xinyang Normal Univ, Sch Comp & Informat Technol, Xinyang 464000, Peoples R China
[2] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Funding
U.S. National Science Foundation; National Natural Science Foundation of China
Keywords
MOLECULAR RECOGNITION FEATURES; INTRINSIC DISORDER; INTERACTION SITES; COMPUTATIONAL PREDICTION; MORFS; IDENTIFICATION; REGIONS; RNA; DNA; SERVER;
DOI
10.1093/bioinformatics/btaa573
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: There are over 30 sequence-based predictors of protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy where the structure-/disorder-specific models may not be able to cross over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions).
Results: Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER for the structure-annotated proteins and disoRDPbind for the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over between structure- and disorder-annotated proteins and produces a relatively low number of cross-predictions, offering an accurate alternative to predict PBRs.
Pages: 4729-4738
Page count: 10