Sequence-based Prediction of Antimicrobial Peptides with CatBoost Classifier

Authors
Yu, Jen-Chieh [1]
Ni, Kuan [1]
Chen, Ching-Tai [2]
Affiliations
[1] Asia Univ, Dept Bioinformat & Med Engn, Taichung, Taiwan
[2] Asia Univ, Dept Bioinformat & Med Engn, Ctr Precis Hlth Res, Taichung, Taiwan
Source
2022 IEEE 22ND INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOENGINEERING (BIBE 2022) | 2022
Keywords
antimicrobial peptide prediction; therapeutic peptide; disease; machine learning; bioinformatics; AMINO-ACID-COMPOSITION; FEATURE-SELECTION; PROTEIN; ANTIBACTERIAL
DOI
10.1109/BIBE55377.2022.00053
Chinese Library Classification
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Antimicrobial resistance is one of the most serious issues for human health. Compared to existing antibiotics, antimicrobial peptides (AMPs) have the advantage of efficiently killing microbes and other pathogens without inducing drug resistance. Large-scale experimental methods to characterize AMPs require wet-lab resources and considerable time. In silico prediction of AMPs, on the other hand, is an attractive strategy to lower the cost and time of discovering new AMPs. In this study, we propose a CatBoost model for AMP prediction. We included various features for the numerical representation of peptides, and then employed a systematic approach to select 130 important features for our machine learning models. In cross-validation, the CatBoost model achieves an accuracy, F1-score, MCC, and AUC of 0.758, 0.750, 0.518, and 0.831, respectively. On an independent test set of 188 peptide sequences, the proposed model achieves an accuracy, MCC, and AUC of 0.814, 0.632, and 0.884, respectively, all of which are the best among the compared methods, which include five state-of-the-art predictors. Our model improves on the MCC of the five existing methods by 2.6% to 21.1%, and on their AUC by 1.3% to 13.3%. The results demonstrate that our CatBoost model is capable of yielding reliable results and can be of great help in discovering novel AMPs.
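The abstract reports accuracy, F1-score, and MCC, and the keywords mention amino-acid composition as a peptide feature. The following is a minimal sketch of how such quantities are commonly computed. It is not the authors' pipeline: the function names `amino_acid_composition` and `binary_metrics` are hypothetical, and the abstract does not say which of the 130 selected features are composition-based.

```python
import math
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide):
    """Return a 20-element list with the fraction of each standard residue.

    Illustrative feature only; non-standard letters are ignored.
    """
    counts = Counter(peptide.upper())
    length = sum(counts[a] for a in AMINO_ACIDS)
    if length == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / length for a in AMINO_ACIDS]


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, F1-score, and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0

    f1_den = 2 * tp + fp + fn
    f1 = 2 * tp / f1_den if f1_den else 0.0

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


if __name__ == "__main__":
    # Toy peptide and made-up counts, not data from the paper.
    print(amino_acid_composition("AAK"))
    print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```

MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy for binary classifiers such as AMP versus non-AMP.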
Pages: 217-220
Page count: 4