Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins

Times cited: 18
Authors
Zhang, Jian [1]
Ghadermarzi, Sina [2]
Kurgan, Lukasz [2]
Affiliations
[1] Xinyang Normal Univ, Sch Comp & Informat Technol, Xinyang 464000, Peoples R China
[2] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA
Funding
U.S. National Science Foundation; National Natural Science Foundation of China
Keywords
MOLECULAR RECOGNITION FEATURES; INTRINSIC DISORDER; INTERACTION SITES; COMPUTATIONAL PREDICTION; MORFS; IDENTIFICATION; REGIONS; RNA; DNA; SERVER;
DOI
10.1093/bioinformatics/btaa573
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: There are over 30 sequence-based predictors of protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy in which the structure- or disorder-specific models may not be able to cross over and accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors, using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions).
Results: Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER on the structure-annotated proteins and disoRDPbind on the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines the results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over between structure- and disorder-annotated proteins and produces a relatively low number of cross-predictions, offering an accurate alternative for predicting PBRs.
Pages: 4729-4738
Page count: 10