Challenges in computational discovery of bioactive peptides in 'omics data

Cited by: 9
Authors
Coelho, Luis Pedro [1 ,2 ]
Santos-Jr, Celio Dias [2 ,3 ]
de la Fuente-Nunez, Cesar [4 ,5 ,6 ,7 ,8 ,9 ,10 ]
Affiliations
[1] Queensland Univ Technol, Ctr Microbiome Res, Sch Biomed Sci, Woolloongabba, Qld, Australia
[2] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence IST, Shanghai, Peoples R China
[3] Fed Univ Sao Carlos UFSCar, Hydrobiol Dept, Lab Microbial Proc & Biodivers LMPB, Sao Paulo, Brazil
[4] Univ Penn, Inst Translat Med & Therapeut, Machine Biol Grp, Dept Psychiat, Inst Biomed Informa, Philadelphia, PA USA
[5] Univ Penn, Inst Translat Med & Therapeut, Perelman Sch Med, Machine Biol Grp, Dept Microbiol, Inst Biomed Inform, Philadelphia, PA USA
[6] Univ Penn, Sch Engn & Appl Sci, Dept Bioengn, Philadelphia, PA USA
[7] Univ Penn, Sch Engn & Appl Sci, Dept Chem & Biomol Engn, Philadelphia, PA USA
[8] Univ Penn, Sch Arts & Sci, Dept Chem, Philadelphia, PA USA
[9] Univ Penn, Penn Inst Computat Sci, Philadelphia, PA USA
[10] Univ Penn, Machine Biol Grp, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (USA); Australian Research Council
Keywords
bioinformatics; biomedicine; data mining < bioinformatics; diseases < biomedicine; infectious; OPEN READING FRAMES; BACTERIAL GENES; PREDICTION; ANNOTATION; REPOSITORY; IDENTIFY; ORFS; SORFS.ORG; PACKAGE;
DOI
10.1002/pmic.202300105
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Peptides have a plethora of activities in biological systems that can potentially be exploited biotechnologically. Several peptides are used clinically, as well as in industry and agriculture. The increase in available 'omics data has recently provided a large opportunity for mining novel enzymes, biosynthetic gene clusters, and molecules. While these data primarily consist of DNA sequences, other types of data provide important complementary information. Due to their size, the approaches proven successful at discovering novel proteins of canonical size cannot be naively applied to the discovery of peptides. Peptides can be encoded directly in the genome as short open reading frames (smORFs), or they can be derived from larger proteins by proteolysis. Both of these peptide classes pose challenges as simple methods for their prediction result in large numbers of false positives. Similarly, functional annotation of larger proteins, traditionally based on sequence similarity to infer orthology and then transferring functions between characterized proteins and uncharacterized ones, cannot be applied for short sequences. The use of these techniques is much more limited and alternative approaches based on machine learning are used instead. Here, we review the limitations of traditional methods as well as the alternative methods that have recently been developed for discovering novel bioactive peptides with a focus on prokaryotic genomes and metagenomes.
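To illustrate the abstract's point that simple smORF prediction yields many false positives, here is a minimal sketch of a naive ORF scan. It is not the method of the reviewed paper or of any specific tool. The function names (`reverse_complement`, `find_orfs`), the length thresholds, and the use of ATG as the only start codon are assumptions made for illustration.

```python
import random

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=10, max_codons=100):
    """Naively list ATG-to-stop ORFs in all six reading frames.

    Length is counted in codons, including the start codon and excluding
    the stop codon. Coordinates are 0-based, end-exclusive, and for the
    "-" strand they refer to the reverse-complemented sequence.
    Only the first ATG after each stop is used (an illustrative simplification).
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if min_codons <= n_codons <= max_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs


# Random DNA encodes nothing, so every short ORF reported here is spurious.
rng = random.Random(0)
random_dna = "".join(rng.choice("ACGT") for _ in range(10_000))
print(len(find_orfs(random_dna)), "short ORFs found in 10 kb of random sequence")
```

Running the scan on random DNA is one way to see the scale of the problem: any short ORF it reports there cannot encode a real peptide, which is why smORF callers need additional evidence or learned models beyond start and stop codons.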
Pages: 9