Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins

Cited by: 18

Authors:

Zhang, Jian ^{[1]}

Ghadermarzi, Sina ^{[2]}

Kurgan, Lukasz ^{[2]}

Affiliations:

[1] Xinyang Normal Univ, Sch Comp & Informat Technol, Xinyang 464000, Peoples R China

[2] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA

Source:

BIOINFORMATICS | 2020 / Vol. 36 / Issue 18

Funding:

National Science Foundation (USA); National Natural Science Foundation of China;

Keywords:

MOLECULAR RECOGNITION FEATURES; INTRINSIC DISORDER; INTERACTION SITES; COMPUTATIONAL PREDICTION; MORFS; IDENTIFICATION; REGIONS; RNA; DNA; SERVER;

DOI:

10.1093/bioinformatics/btaa573

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification code:

071010 ; 081704 ;

Abstract:

Motivation: There are over 30 sequence-based predictors of protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy where the structure-/disorder-specific models may not be able to cross over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions).

Results: Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER for the structure-annotated proteins and disoRDPbind for the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over between structure- and disorder-annotated proteins and produces a relatively low number of cross-predictions, offering an accurate alternative for predicting PBRs.


Pages: 4729-4738

Page count: 10

49 items in total

[11] SNBRFinder: A Sequence-Based Hybrid Algorithm for Enhanced Prediction of Nucleic Acid-Binding Residues
Yang, Xiaoxia
Wang, Jia
Sun, Jun
Liu, Rong
PLOS ONE, 2015, 10 (07)
[12] PROBselect: accurate prediction of protein-binding residues from proteins sequences via dynamic predictor selection
Zhang, Fuhao
Shi, Wenbo
Zhang, Jian
Zeng, Min
Li, Min
Kurgan, Lukasz
BIOINFORMATICS, 2020, 36 : I735 - I744
[13] Accurate Sequence-Based Prediction of Deleterious nsSNPs with Multiple Sequence Profiles and Putative Binding Residues
Song, Ruiyang
Cao, Baixin
Peng, Zhenling
Oldfield, Christopher J.
Kurgan, Lukasz
Wong, Ka-Chun
Yang, Jianyi
BIOMOLECULES, 2021, 11 (09)
[14] Efficient mapping of RNA-binding residues in RNA-binding proteins using local sequence features of binding site residues in protein-RNA complexes
Agarwal, Ankita
Kant, Shri
Bahadur, Ranjit Prasad
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2023, 91 (09) : 1361 - 1379
[15] Sequence-Based Prediction of Protein-Peptide Binding Sites Using Support Vector Machine
Taherzadeh, Ghazaleh
Yang, Yuedong
Zhang, Tuo
Liew, Alan Wee-Chung
Zhou, Yaoqi
JOURNAL OF COMPUTATIONAL CHEMISTRY, 2016, 37 (13) : 1223 - 1229
[16] PMSFF: Improved Protein Binding Residues Prediction through Multi-Scale Sequence-Based Feature Fusion Strategy
Li, Yuguang
Nan, Xiaofei
Zhang, Shoutao
Zhou, Qinglei
Lu, Shuai
Tian, Zhen
BIOMOLECULES, 2024, 14 (10)
[17] A Novel Sequence-Based Method of Predicting Protein DNA-Binding Residues, Using a Machine Learning Approach
Cai, Yudong
He, ZhiSong
Shi, Xiaohe
Kong, Xiangying
Gu, Lei
Xie, Lu
MOLECULES AND CELLS, 2010, 30 (02) : 99 - 105
[18] Sequence-based prediction of protein binding regions and drug-target interactions
Lee, Ingoo
Nam, Hojung
JOURNAL OF CHEMINFORMATICS, 2022, 14 (01)
[19] The s2D Method: Simultaneous Sequence-Based Prediction of the Statistical Populations of Ordered and Disordered Regions in Proteins
Sormanni, Pietro
Camilloni, Carlo
Fariselli, Piero
Vendruscolo, Michele
JOURNAL OF MOLECULAR BIOLOGY, 2015, 427 (04) : 982 - 996
[20] Prediction of Intrinsically Disordered Proteins Using Machine Learning Based on Low Complexity Methods
Zeng, Xingming
Liu, Haiyuan
He, Hao
ALGORITHMS, 2022, 15 (03)
