IDPpred: a new sequence-based predictor for identification of intrinsically disordered protein with enhanced accuracy

Cited by: 1
Authors
Chaurasiya, Deepak [1]
Mondal, Rajkrishna [2]
Lahiri, Tapobrata [1]
Tripathi, Asmita [1]
Ghinmine, Tejas [1]
Affiliations
[1] Indian Inst Informat Technol, Dept Appl Sci, Prayagraj, Uttar Pradesh, India
[2] Nagaland Univ, Dept Biotechnol, Dimapur, Nagaland, India
Keywords
Intrinsically disordered protein; numerical representation of sequence; periodicity count value and predictor; REGIONS;
DOI
10.1080/07391102.2023.2290615
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
The discovery of intrinsically disordered proteins (IDPs) and of hybrid proteins that contain both intrinsically disordered protein regions (IDPRs) and ordered regions has changed the sequence-structure-function paradigm of proteins. These proteins, which lack a persistent fixed structure, are found in all organisms and play vital roles in various biological processes. Some are considered potential drug targets because they are overrepresented in pathophysiological processes. The major bottlenecks in characterizing such proteins are their occasional overexpression, the difficulty of obtaining them in a purified, homogeneous form, and the challenge of investigating them experimentally. Sequence-based prediction of intrinsic disorder remains a useful strategy, especially for large-scale proteomic investigations. However, accuracy is still poorest for short disordered regions of fewer than ten residues, for residues close to order-disorder boundaries, for regions that undergo coupled folding and binding in the presence of a partner, and for fully disordered proteins. Annotation of fully disordered proteins relies mostly on far-UV circular dichroism experiments, which give the overall secondary structure composition without residue-level resolution. In the recent Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment, current methods, including those that use secondary structure information, failed to predict half of the target IDPs correctly. This study used profiles of the random sequential appearance of physicochemical properties of amino acids, and of the random sequential appearance of order- and disorder-promoting amino acids in a protein, together with the existing CIDER feature, to predict IDPs from sequence input. Our method was found to significantly outperform the existing predictors across different datasets.
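As a rough illustration of what a numerical representation of a sequence by order- and disorder-promoting residues could look like, the following is a minimal sketch and not the IDPpred implementation. The residue grouping is one commonly cited classification and is an assumption here, as are the function names `promoter_profile`, `disorder_fraction`, and `sign_changes`; the abstract does not specify the features at this level of detail.

```python
# Hypothetical sketch: the residue groups below are an assumed, commonly cited
# classification, not necessarily the one used by the paper.
DISORDER_PROMOTING = set("ARGQSPEK")
ORDER_PROMOTING = set("WCFIYVLN")


def promoter_profile(seq):
    """Map each residue to +1 (disorder-promoting), -1 (order-promoting), or 0 (other)."""
    profile = []
    for aa in seq.upper():
        if aa in DISORDER_PROMOTING:
            profile.append(1)
        elif aa in ORDER_PROMOTING:
            profile.append(-1)
        else:
            profile.append(0)
    return profile


def disorder_fraction(seq):
    """Fraction of residues in the disorder-promoting group (0.0 for an empty sequence)."""
    profile = promoter_profile(seq)
    if not profile:
        return 0.0
    return sum(1 for v in profile if v == 1) / len(profile)


def sign_changes(profile):
    """Count switches between order- and disorder-promoting residues, skipping neutral ones."""
    signed = [v for v in profile if v != 0]
    return sum(1 for a, b in zip(signed, signed[1:]) if a != b)
```

A call such as `sign_changes(promoter_profile(sequence))` gives a crude measure of how often the two residue classes alternate along a chain; a real predictor would combine many such descriptors with a trained classifier.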
Pages: 957-965
Page count: 9
Related papers
27 in total
[21]   Improving Sequence-Based Prediction of Protein Peptide Binding Residues by Introducing Intrinsic Disorder and a Consensus Method [J].
Zhao, Zijuan ;
Peng, Zhenling ;
Yang, Jianyi .
JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2018, 58 (07) :1459-1468
[22]   Predictions of Backbone Dynamics in Intrinsically Disordered Proteins Using De Novo Fragment-Based Protein Structure Predictions [J].
Kosciolek, Tomasz ;
Buchan, Daniel W. A. ;
Jones, David T. .
SCIENTIFIC REPORTS, 2017, 7
[23]   Sequence-based identification of amyloidogenic β-hairpins reveals a prostatic acid phosphatase fragment promoting semen amyloid formation [J].
Heid, Laetitia F. ;
Agerschou, Emil Dandanell ;
Orr, Asuka A. ;
Kupreichyk, Tatsiana ;
Schneider, Walfried ;
Woerdehoff, Michael M. ;
Schwarten, Melanie ;
Willbold, Dieter ;
Tamamis, Phanourios ;
Stoldt, Matthias ;
Hoyer, Wolfgang .
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2024, 23 :417-430
[24]   A model for identification of potential phase-separated proteins based on protein sequence, structure and cellular distribution [J].
Wang, Jiyan ;
Chang, Hongkai ;
Quan, Xiaojing ;
Dai, Xintong ;
Wang, Yan ;
Wang, Chenxi ;
Zhang, Shuai ;
Shan, Changliang .
INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES, 2023, 243
[25]   BERT-Promoter: An improved sequence-based predictor of DNA promoter using BERT pre-trained model and SHAP feature selection [J].
Le, Nguyen Quoc Khanh ;
Ho, Quang-Thai ;
Nguyen, Van-Nui ;
Chang, Jung-Su .
COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2022, 99
[26]   A new technique for predicting intrinsically disordered regions based on average distance map constructed with inter-residue average distance statistics [J].
Shimomura, Takumi ;
Nishijima, Kohki ;
Kikuchi, Takeshi .
BMC STRUCTURAL BIOLOGY, 2019, 19
[27]   Ligand-based optimization and biological evaluation of N-(2,2,2-trichloro-1-(3-phenylthioureido)ethyl)acetamide derivatives as potent intrinsically disordered protein c-Myc inhibitors [J].
Chen, Limin ;
Cheng, Beiming ;
Sun, Qi ;
Lai, Luhua .
BIOORGANIC & MEDICINAL CHEMISTRY LETTERS, 2021, 31