Comprehensive assessment of machine learning-based methods for predicting antimicrobial peptides

Cited by: 82
Authors
Xu, Jing [2 ,3 ]
Li, Fuyi [4 ]
Leier, Andre [5 ,6 ,7 ]
Xiang, Dongxu [2 ]
Shen, Hsin-Hui [8 ,9 ]
Lago, Tatiana T. Marquez [5 ,10 ,11 ]
Li, Jian [12 ,13 ]
Yu, Dong-Jun [1 ]
Song, Jiangning [12 ,14 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, 200 Xiaolingwei, Nanjing 210094, Peoples R China
[2] Monash Univ, Dept Biochem & Mol Biol, Clayton, Vic, Australia
[3] Monash Univ, Biomed Discovery Inst, Clayton, Vic, Australia
[4] Univ Melbourne, Peter Doherty Inst Infect & Immun, Dept Microbiol & Immunol, Melbourne, Vic, Australia
[5] UAB Sch Med, Dept Genet, Birmingham, AL USA
[6] UABs ONeal Comprehens Canc Ctr, Birmingham, AL USA
[7] Gregory Fleming James Cyst Fibrosis Res Ctr, Birmingham, AL USA
[8] Monash Univ, Dept Biochem & Mol Biol, Melbourne, Vic, Australia
[9] Monash Univ, Dept Mat Sci & Engn, Clayton, Vic, Australia
[10] UAB Sch Med, Dept Microbiol, Birmingham, AL USA
[11] UAB Gregory Fleming James Cyst Fibrosis Res Ct, Birmingham, AL USA
[12] Monash Univ, Monash Biomed Discovery Inst, Clayton, Vic, Australia
[13] Monash Univ, Dept Microbiol, Clayton, Vic, Australia
[14] Monash Univ, Monash Data Futures Inst, Clayton, Vic, Australia
Funding
National Natural Science Foundation of China; Australian Research Council; UK Medical Research Council;
Keywords
antimicrobial peptides; bioinformatics; machine learning; deep learning; feature engineering; predictors; AMINO-ACID-COMPOSITION; LOGISTIC-REGRESSION; WEB SERVER; CD-HIT; PROTEIN; DATABASE; CLASSIFICATION; EVOLUTIONARY; TOOL; DNA;
DOI
10.1093/bib/bbab083
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Antimicrobial peptides (AMPs) are a unique and diverse group of molecules that play a crucial role in a myriad of biological processes and cellular functions. AMP-related studies have become increasingly popular in recent years due to antimicrobial resistance, which is becoming an emerging global concern. Systematic experimental identification of AMPs faces many difficulties due to the limitations of current methods. Given its significance, more than 30 computational methods have been developed for accurate prediction of AMPs. These approaches show high diversity in their data set size, data quality, core algorithms, feature extraction, feature selection techniques and evaluation strategies. Here, we provide a comprehensive survey on a variety of current approaches for AMP identification and point at the differences between these methods. In addition, we evaluate the predictive performance of the surveyed tools based on an independent test data set containing 1536 AMPs and 1536 non-AMPs. Furthermore, we construct six validation data sets based on six different common AMP databases and compare different computational methods based on these data sets. The results indicate that amPEPpy achieves the best predictive performance and outperforms the other compared methods. As the predictive performances are affected by the different data sets used by different methods, we additionally perform the 5-fold cross-validation test to benchmark different traditional machine learning methods on the same data set. These cross-validation results indicate that random forest, support vector machine and eXtreme Gradient Boosting achieve comparatively better performances than other machine learning methods and are often the algorithms of choice of multiple AMP prediction tools.
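As a rough illustration of the benchmarking setup the abstract describes (sequence-derived features evaluated under 5-fold cross-validation), the following minimal sketch computes amino acid composition features and produces class-balanced folds using only the Python standard library. It is not the authors' pipeline: the function names `aac` and `stratified_kfold`, the toy peptide sequences, and the choice of amino acid composition as the feature set are all assumptions made for the example.

```python
from collections import Counter
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def aac(sequence):
    """Return the 20-dimensional amino acid composition of a peptide (hypothetical helper)."""
    counts = Counter(sequence.upper())
    # Count only standard residues; fall back to 1 to avoid division by zero.
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs, spreading each class evenly over k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin into the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

if __name__ == "__main__":
    # Toy, made-up sequences: 10 labelled "AMP" (1) and 10 labelled "non-AMP" (0).
    peptides = ["KKLLKKLLKK"] * 10 + ["GSGSGSGSGS"] * 10
    labels = [1] * 10 + [0] * 10
    features = [aac(p) for p in peptides]
    for train_idx, test_idx in stratified_kfold(labels, k=5):
        # A classifier (e.g. random forest or SVM) would be fitted on
        # features[train_idx] and scored on features[test_idx] here.
        print(len(train_idx), len(test_idx))
```

Stratifying the folds keeps the AMP to non-AMP ratio the same in every split, which matters when methods are compared on a balanced data set such as the 1536 versus 1536 independent test set mentioned above.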
Pages: 22
Related papers
50 in total
  • [21] What can machine learning do for antimicrobial peptides, and what can antimicrobial peptides do for machine learning?
    Lee, Ernest Y.
    Lee, Michelle W.
    Fulan, Benjamin M.
    Ferguson, Andrew L.
    Wong, Gerard C. L.
    INTERFACE FOCUS, 2017, 7 (06)
  • [22] Machine Learning-Based Anomaly Detection in NFV: A Comprehensive Survey
    Zehra, Sehar
    Faseeha, Ummay
    Syed, Hassan Jamil
    Samad, Fahad
    Ibrahim, Ashraf Osman
    Abulfaraj, Anas W.
    Nagmeldin, Wamda
    SENSORS, 2023, 23 (11)
  • [23] Machine Learning-Based Analysis of Program Binaries: A Comprehensive Study
    Xue, Hongfa
    Sun, Shaowen
    Venkataramani, Guru
    Lan, Tian
    IEEE ACCESS, 2019, 7 : 65889 - 65912
  • [24] Learning to hash: a comprehensive survey of deep learning-based hashing methods
    Singh, Avantika
    Gupta, Shaifu
    KNOWLEDGE AND INFORMATION SYSTEMS, 2022, 64 (10) : 2565 - 2597
  • [26] Machine learning-based spectrum occupancy prediction: a comprehensive survey
    Aygul, Mehmet Ali
    Cirpan, Hakan Ali
    Arslan, Huseyin
    FRONTIERS IN COMMUNICATIONS AND NETWORKS, 2025, 6
  • [27] Machine Learning-Based Method for Predicting Compressive Strength of Concrete
    Li, Daihong
    Tang, Zhili
    Kang, Qian
    Zhang, Xiaoyu
    Li, Youhua
    PROCESSES, 2023, 11 (02)
  • [28] Machine learning-based approach for predicting low birth weight
    Ranjbar, Amene
    Montazeri, Farideh
    Farashah, Mohammadsadegh Vahidi
    Mehrnoush, Vahid
    Darsareh, Fatemeh
    Roozbeh, Nasibeh
    BMC PREGNANCY AND CHILDBIRTH, 2023, 23 (01)
  • [30] Machine Learning-based Models for Predicting the Penetration Depth of Concrete
    Li M.
    Wu H.
    Dong H.
    Ren G.
    Zhang P.
    Huang F.
    Binggong Xuebao/Acta Armamentarii, 2023, 44 (12) : 3771 - 3782