Protein Language Models and Machine Learning Facilitate the Identification of Antimicrobial Peptides

Cited by: 2
Authors
Medina-Ortiz, David [1, 2]
Contreras, Seba [3]
Fernandez, Diego [1]
Soto-Garcia, Nicole [1]
Moya, Ivan [1, 4]
Cabas-Mora, Gabriel [1]
Olivera-Nappa, Alvaro [2, 5]
Affiliations
[1] Univ Magallanes, Dept Ingn Comp, Punta Arenas 6210005, Chile
[2] Univ Chile, Ctr Biotechnol & Bioengn, CeBiB, Santiago 8370456, Chile
[3] Max Planck Inst Dynam & Self Org, Fassberg 17, D-37077 Gottingen, Germany
[4] Univ Magallanes, Dept Ingn Quim, Punta Arenas 6210005, Chile
[5] Univ Chile, Dept Ingn Quim Biotecnol & Mat, Santiago 8370456, Chile
Keywords
antimicrobial peptides; machine learning; protein language models; generative learning; peptide discovery; peptide design; PREDICTION; CLASSIFICATION; DESIGN
DOI
10.3390/ijms25168851
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Peptides are bioactive molecules whose functional versatility in living organisms has led to successful applications in diverse fields. In recent years, the amount of data describing peptide sequences and function collected in open repositories has substantially increased, allowing the application of more complex computational models to study the relationship between peptide composition and function. This work introduces AMP-Detector, a sequence-based classification model for detecting the functional biological activity of peptides, focusing on accelerating the discovery and de novo design of potential antimicrobial peptides (AMPs). AMP-Detector introduces a novel sequence-based pipeline to train binary classification models, integrating protein language models and machine learning algorithms. This pipeline produced 21 models targeting antimicrobial, antiviral, and antibacterial activity, achieving average precision exceeding 83%. Benchmark analyses revealed that our models outperformed existing methods for AMPs and delivered comparable results for other biological activity types. Utilizing the Peptide Atlas, we applied AMP-Detector to discover over 190,000 potential AMPs and demonstrated that it can be integrated with generative learning to aid de novo design, yielding over 500 novel AMPs. The combination of our methodology, robust models, and a generative design strategy offers a significant advancement in peptide-based drug discovery and represents a pivotal tool for therapeutic applications.
Pages: 19
Related papers
50 in total
  • [21] Identification of Scientific Texts Generated by Large Language Models Using Machine Learning
    Soto-Osorio, David
    Sidorov, Grigori
    Chanona-Hernandez, Liliana
    Lopez-Ramirez, Blanca Cecilia
    [J]. COMPUTERS, 2024, 13 (12)
  • [22] Combining genetic algorithm with machine learning strategies for designing potent antimicrobial peptides
    Boone, Kyle
    Wisdom, Cate
    Camarda, Kyle
    Spencer, Paulette
    Tamerler, Candan
    [J]. BMC BIOINFORMATICS, 2021, 22 (01)
  • [23] Machine learning-enabled predictive modeling to precisely identify the antimicrobial peptides
    Mushtaq Ahmad Wani
    Prabha Garg
    Kuldeep K. Roy
    [J]. Medical & Biological Engineering & Computing, 2021, 59 : 2397 - 2408
  • [24] CalcAMP: A New Machine Learning Model for the Accurate Prediction of Antimicrobial Activity of Peptides
    Bournez, Colin
    Riool, Martijn
    de Boer, Leonie
    Cordfunke, Robert A.
    de Best, Leonie
    van Leeuwen, Remko
    Drijfhout, Jan Wouter
    Zaat, Sebastian A. J.
    van Westen, Gerard J. P.
    [J]. ANTIBIOTICS-BASEL, 2023, 12 (04)
  • [25] Machine learning-enabled predictive modeling to precisely identify the antimicrobial peptides
    Wani, Mushtaq Ahmad
    Garg, Prabha
    Roy, Kuldeep K.
    [J]. MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2021, 59 (11-12) : 2397 - 2408
  • [26] Machine learning for antimicrobial peptide identification and design
    Wan, Fangping
    Wong, Felix
    Collins, James J.
    de la Fuente-Nunez, Cesar
    [J]. NATURE REVIEWS BIOENGINEERING, 2024, 2 (05) : 392 - 407
  • [27] Characterization and Identification of Natural Antimicrobial Peptides on Different Organisms
    Chung, Chia-Ru
    Jhong, Jhih-Hua
    Wang, Zhuo
    Chen, Siyu
    Wan, Yu
    Horng, Jorng-Tzong
    Lee, Tzong-Yi
    [J]. INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2020, 21 (03)
  • [28] Identification of Phishing URLs Using Machine Learning Models
    Vivek, Meghashyam
    Premjith, Nithin
    Johnson, Aaron Antonio
    Maurya, Ashutosh Kumar
    Jingle, I. Diana Jeba
    [J]. FOURTH CONGRESS ON INTELLIGENT SYSTEMS, VOL 3, CIS 2023, 2024, 865 : 209 - 219
  • [29] Machine learning designs non-hemolytic antimicrobial peptides
    Capecchi, Alice
    Cai, Xingguang
    Personne, Hippolyte
    Kohler, Thilo
    van Delden, Christian
    Reymond, Jean-Louis
    [J]. CHEMICAL SCIENCE, 2021, 12 (26) : 9221 - 9232
  • [30] Systematic Identification of Machine-Learning Models Aimed to Classify Critical Residues for Protein Function from Protein Structure
    Corral-Corral, Ricardo
    Beltran, Jesus A.
    Brizuela, Carlos A.
    Del Rio, Gabriel
    [J]. MOLECULES, 2017, 22 (10)