Accurately Predicting Glutarylation Sites Using Sequential Bi-Peptide-Based Evolutionary Features

Cited by: 17
Authors
Arafat, Md Easin [1]
Ahmad, Md Wakil [1]
Shovan, S. M. [2]
Dehzangi, Abdollah [3, 4]
Dipta, Shubhashis Roy [1]
Hasan, Md Al Mehedi [2]
Taherzadeh, Ghazaleh [5]
Shatabda, Swakkhar [1]
Sharma, Alok [6, 7, 8, 9]
Affiliations
[1] United Int Univ, Dept Comp Sci & Engn, Dhaka 1212, Bangladesh
[2] Rajshahi Univ Engn & Technol, Dept Comp Sci & Engn, Rajshahi 6204, Bangladesh
[3] Rutgers State Univ, Dept Comp Sci, Camden, NJ 08102 USA
[4] Rutgers State Univ, Ctr Computat & Integrat Biol, Camden, NJ 08102 USA
[5] Univ Maryland, Inst Biosci & Biotechnol Res, College Pk, MD 20742 USA
[6] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld 4111, Australia
[7] Tokyo Med & Dent Univ TMDU, Dept Med Sci Math, Tokyo 1138510, Japan
[8] RIKEN Ctr Integrat Med Sci, Lab Med Sci Math, Yokohama, Kanagawa 2300045, Japan
[9] Univ South Pacific, Fac Sci Technol & Environm, Sch Engn & Phys, Suva, Fiji
Keywords
post-translational modification; lysine glutarylation; machine learning; extra-trees classifier; bi-peptide evolutionary features; LYSINE SUCCINYLATION SITES; HOMOLOGY-BASED PREDICTION; POSTTRANSLATIONAL MODIFICATIONS; SCORING MATRIX; MALONYLATION; PROTEINS; IDENTIFICATION; LOCALIZATION; LOCATIONS; RESIDUES
DOI
10.3390/genes11091023
Chinese Library Classification
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Post-translational modification (PTM) is defined as the alteration of a protein sequence upon interaction with different macromolecules after translation. Glutarylation is considered one of the most important PTMs and is associated with a wide range of cellular functions, including metabolism, translation, and specific subcellular localizations. Over the past few years, a wide range of computational approaches have been proposed to predict glutarylation sites. Despite these efforts, prediction performance for glutarylation sites has remained limited. One of the main challenges is extracting features that carry significant discriminatory information. To address this issue, we propose a new machine learning method, BiPepGlut, which uses a bi-peptide-based evolutionary method for feature extraction. For classification, we use the Extra-Trees (ET) classifier, which, to the best of our knowledge, has never been used for this task. Our results demonstrate that BiPepGlut significantly outperforms previously proposed models. BiPepGlut achieves 92.0% accuracy, 84.8% sensitivity, 95.6% specificity, a Matthews correlation coefficient of 0.82, and an F1-score of 0.88. BiPepGlut is implemented as a publicly available online predictor.
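The abstract names two ideas that a short example may make more concrete: bi-peptide evolutionary features and the five reported evaluation metrics. The sketch below is illustrative only and is not the authors' implementation. It assumes that "bi-peptide" features are built by summing products of profile scores at consecutive positions of a position-specific scoring matrix (PSSM), which is one common formulation; the abstract does not give the exact definition. The function names `bipeptide_features` and `binary_metrics` and all input values are hypothetical.

```python
from math import sqrt


def bipeptide_features(pssm):
    """Flattened n x n bi-peptide feature vector from an L x n profile.

    Assumed formulation (not confirmed by the abstract):
    feature[i][j] = sum over positions k of pssm[k][i] * pssm[k + 1][j].
    For a real PSSM n is 20, which gives 400 features per sequence window.
    """
    n = len(pssm[0])
    feats = [[0.0] * n for _ in range(n)]
    # Walk over every pair of consecutive positions in the window.
    for row, nxt in zip(pssm, pssm[1:]):
        for i, a in enumerate(row):
            for j, b in enumerate(nxt):
                feats[i][j] += a * b
    return [value for r in feats for value in r]


def _ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity, MCC and F1 from confusion counts."""
    mcc_den = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "mcc": _ratio(tp * tn - fp * fn, mcc_den),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Toy 3-position, 2-column profile; a real PSSM has 20 columns.
    toy_pssm = [[1, 0], [0, 1], [1, 1]]
    print(bipeptide_features(toy_pssm))
    # Made-up confusion counts, unrelated to the results reported above.
    print(binary_metrics(tp=8, fp=2, tn=18, fn=2))
```

In a full pipeline, feature vectors of this kind would be passed to a classifier such as scikit-learn's `ExtraTreesClassifier`; that step is omitted here to keep the sketch dependency-free.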
Pages: 16