Accurately Predicting Glutarylation Sites Using Sequential Bi-Peptide-Based Evolutionary Features

Cited by: 17
Authors
Arafat, Md Easin [1]
Ahmad, Md Wakil [1]
Shovan, S. M. [2]
Dehzangi, Abdollah [3, 4]
Dipta, Shubhashis Roy [1]
Hasan, Md Al Mehedi [2]
Taherzadeh, Ghazaleh [5]
Shatabda, Swakkhar [1]
Sharma, Alok [6, 7, 8, 9]
Affiliations
[1] United Int Univ, Dept Comp Sci & Engn, Dhaka 1212, Bangladesh
[2] Rajshahi Univ Engn & Technol, Dept Comp Sci & Engn, Rajshahi 6204, Bangladesh
[3] Rutgers State Univ, Dept Comp Sci, Camden, NJ 08102 USA
[4] Rutgers State Univ, Ctr Computat & Integrat Biol, Camden, NJ 08102 USA
[5] Univ Maryland, Inst Biosci & Biotechnol Res, College Pk, MD 20742 USA
[6] Griffith Univ, Inst Integrated & Intelligent Syst, Brisbane, Qld 4111, Australia
[7] Tokyo Med & Dent Univ TMDU, Dept Med Sci Math, Tokyo 1138510, Japan
[8] RIKEN Ctr Integrat Med Sci, Lab Med Sci Math, Yokohama, Kanagawa 2300045, Japan
[9] Univ South Pacific, Fac Sci Technol & Environm, Sch Engn & Phys, Suva, Fiji
Keywords
post-translational modification; lysine glutarylation; machine learning; extra-trees classifier; bi-peptide evolutionary features; LYSINE SUCCINYLATION SITES; HOMOLOGY-BASED PREDICTION; POSTTRANSLATIONAL MODIFICATIONS; SCORING MATRIX; MALONYLATION; PROTEINS; IDENTIFICATION; LOCALIZATION; LOCATIONS; RESIDUES
DOI
10.3390/genes11091023
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Post-translational modification (PTM) is defined as the alteration of a protein sequence upon interaction with different macromolecules after the translation process. Glutarylation is considered one of the most important PTMs and is associated with a wide range of cellular functions, including metabolism, translation, and specific subcellular localizations. Over the past few years, a wide range of computational approaches has been proposed to predict glutarylation sites. Despite these efforts, however, prediction performance for glutarylation sites has remained limited. One of the main challenges is extracting features that carry significant discriminatory information. To address this issue, we propose a new machine learning method called BiPepGlut, which uses a bi-peptide-based evolutionary method for feature extraction. For classification, we use the Extra-Trees (ET) classifier, which, to the best of our knowledge, has never been used for this task. Our results demonstrate that BiPepGlut significantly outperforms previously proposed models for this problem. BiPepGlut achieves 92.0%, 84.8%, 95.6%, 0.82, and 0.88 in accuracy, sensitivity, specificity, Matthews correlation coefficient, and F1-score, respectively. BiPepGlut is implemented as a publicly available online predictor.
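The five evaluation measures named in the abstract are all functions of a binary confusion matrix. The following is a minimal sketch of the standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn count positive (e.g. modified) sites predicted correctly/incorrectly;
    tn/fp count negative sites predicted correctly/incorrectly.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall on positives
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # recall on negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and sensitivity.
    pr_sum = precision + sensitivity
    f1 = 2 * precision * sensitivity / pr_sum if pr_sum else 0.0
    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from the paper).
m = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Reporting the Matthews correlation coefficient alongside accuracy is useful when positive and negative sites are imbalanced, because it uses all four cells of the confusion matrix.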
Pages: 16