Protein Language Models and Machine Learning Facilitate the Identification of Antimicrobial Peptides

Cited by: 6
Authors
Medina-Ortiz, David [1, 2]
Contreras, Seba [3]
Fernandez, Diego [1]
Soto-Garcia, Nicole [1]
Moya, Ivan [1, 4]
Cabas-Mora, Gabriel [1]
Olivera-Nappa, Alvaro [2, 5]
Affiliations
[1] Univ Magallanes, Dept Ingn Comp, Punta Arenas 6210005, Chile
[2] Univ Chile, Ctr Biotechnol & Bioengn, CeBiB, Santiago 8370456, Chile
[3] Max Planck Inst Dynam & Self Org, Fassberg 17, D-37077 Gottingen, Germany
[4] Univ Magallanes, Dept Ingn Quim, Punta Arenas 6210005, Chile
[5] Univ Chile, Dept Ingn Quim Biotecnol & Mat, Santiago 8370456, Chile
Keywords
antimicrobial peptides; machine learning; protein language models; generative learning; peptide discovery; peptide design; PREDICTION; CLASSIFICATION; DESIGN
DOI
10.3390/ijms25168851
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Peptides are bioactive molecules whose functional versatility in living organisms has led to successful applications in diverse fields. In recent years, the amount of data describing peptide sequences and function collected in open repositories has substantially increased, allowing the application of more complex computational models to study the relationship between peptide composition and function. This work introduces AMP-Detector, a sequence-based classification model for the detection of peptides' functional biological activity, focusing on accelerating the discovery and de novo design of potential antimicrobial peptides (AMPs). AMP-Detector introduces a novel sequence-based pipeline to train binary classification models, integrating protein language models and machine learning algorithms. This pipeline produced 21 models targeting antimicrobial, antiviral, and antibacterial activity, achieving average precision exceeding 83%. Benchmark analyses revealed that our models outperformed existing methods for AMPs and delivered comparable results for other biological activity types. Applying AMP-Detector to the Peptide Atlas, we identified over 190,000 potential AMPs, and we demonstrated that it can be integrated with generative learning to aid de novo design, resulting in over 500 novel AMPs. The combination of our methodology, robust models, and a generative design strategy offers a significant advancement in peptide-based drug discovery and represents a pivotal tool for therapeutic applications.
Pages: 19
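The abstract describes a pipeline that turns each peptide sequence into a numerical representation and then trains a binary classifier on it, evaluated by precision. The sketch below illustrates that general shape only and is not AMP-Detector's implementation. Everything in it is an assumption made for illustration: the `embed` function is a simple amino-acid composition vector standing in for a protein language model embedding, the classifier is a hand-written logistic regression, and the sequences are made-up toy strings rather than real peptides.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def embed(sequence):
    """Stand-in for a protein language model embedding (assumption).

    Returns the 20-dimensional amino-acid composition of the sequence.
    A real pipeline would call a pretrained model here instead.
    """
    n = len(sequence)
    return [sequence.count(aa) / n for aa in AMINO_ACIDS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=500):
    """Fit logistic regression with plain stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss w.r.t. the score
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict(model, sequence, threshold=0.5):
    """Return 1 (predicted active) or 0 (predicted inactive)."""
    w, b = model
    x = embed(sequence)
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return int(sigmoid(score) >= threshold)


def precision(y_true, y_pred):
    """Fraction of predicted positives that are true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy, made-up sequences (hypothetical, not curated peptides):
# cationic/hydrophobic strings as "positives", acidic/polar as "negatives".
positives = ["KKLLKKLLKKLL", "RRWWRRWWRR", "KLKLKLKLKK", "RKRKWLWLKK"]
negatives = ["DDEEDDEEDD", "GSGSGSGSGS", "EEDDSSGGEE", "SGDESGDESG"]

sequences = positives + negatives
labels = [1] * len(positives) + [0] * len(negatives)

model = train_logistic([embed(s) for s in sequences], labels)
train_predictions = [predict(model, s) for s in sequences]
train_precision = precision(labels, train_predictions)
```

In practice the embedding step would presumably be replaced by vectors from a pretrained protein language model, and precision would be measured on held-out data rather than on the training set as done here for brevity.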