Prediction of protein-binding residues: dichotomy of sequence-based methods developed using structured complexes versus disordered proteins

Cited by: 18

Authors:

Zhang, Jian [1]

Ghadermarzi, Sina [2]

Kurgan, Lukasz [2]

Affiliations:

[1] Xinyang Normal Univ, Sch Comp & Informat Technol, Xinyang 464000, Peoples R China

[2] Virginia Commonwealth Univ, Dept Comp Sci, Richmond, VA 23284 USA

Source:

BIOINFORMATICS | 2020 / Volume 36 / Issue 18

Funding:

National Science Foundation (USA); National Natural Science Foundation of China;

Keywords:

MOLECULAR RECOGNITION FEATURES; INTRINSIC DISORDER; INTERACTION SITES; COMPUTATIONAL PREDICTION; MORFS; IDENTIFICATION; REGIONS; RNA; DNA; SERVER;

DOI:

10.1093/bioinformatics/btaa573

Chinese Library Classification (CLC) number:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Motivation: There are over 30 sequence-based predictors of protein-binding residues (PBRs). They use either structure-annotated or disorder-annotated training datasets, potentially creating a dichotomy where the structure- or disorder-specific models may not be able to cross over to accurately predict the other type. Moreover, the structure-trained predictors were shown to substantially cross-predict PBRs among residues that interact with non-protein partners (nucleic acids and small ligands). We address these issues by performing a first-of-its-kind comparative study of a representative collection of disorder- and structure-trained predictors using a comprehensive benchmark set with the structure- and disorder-derived annotations of PBRs (to analyze the cross-over) and the protein-, nucleic acid- and small ligand-binding proteins (to study the cross-predictions).

Results: Three predictors provide accurate results: SCRIBER, ANCHOR and disoRDPbind. Some of the structure-trained methods make accurate predictions on the structure-annotated proteins. Similarly, the disorder-trained predictors predict well on the disorder-annotated proteins. However, the considered predictors generally fail to cross over, with the exception of SCRIBER. Our study also reveals that virtually all methods substantially cross-predict PBRs, except for SCRIBER on the structure-annotated proteins and disoRDPbind on the disorder-annotated proteins. We formulate a novel hybrid predictor, hybridPBRpred, that combines results produced by disoRDPbind and SCRIBER to accurately predict disorder- and structure-annotated PBRs. HybridPBRpred generates accurate results that cross over between structure- and disorder-annotated proteins and produces a relatively low number of cross-predictions, offering an accurate alternative for predicting PBRs.


Pages: 4729-4738

Page count: 10
