Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins

Cited by: 18

Authors:

Zhang, Jian [1]

Ghadermarzi, Sina [2]

Kurgan, Lukasz [2]

Affiliations:

[1] Xinyang Normal Univ, Sch Comp & Informat Technol, Xinyang 464000, Peoples R China

[2] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA

Source:

BIOINFORMATICS | 2020 / Volume 36 / Issue 18

Funding:

US National Science Foundation; National Natural Science Foundation of China;

Keywords:

MOLECULAR RECOGNITION FEATURES; INTRINSIC DISORDER; INTERACTION SITES; COMPUTATIONAL PREDICTION; MORFS; IDENTIFICATION; REGIONS; RNA; DNA; SERVER;

DOI:

10.1093/bioinformatics/btaa573

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Motivation: There are over 30 sequence-based predictors of protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy where the structure-/disorder-specific models may not be able to cross over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions). Results: Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER on the structure-annotated proteins and disoRDPbind on the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over between structure- and disorder-annotated proteins and produces a relatively low number of cross-predictions, offering an accurate alternative for predicting PBRs.


Pages: 4729-4738

Page count: 10
