DeepRiPP integrates multiomics data to automate discovery of novel ribosomally synthesized natural products

Cited by: 108
Authors
Merwin, Nishanth J. [1]
Mousa, Walaa K. [2,3]
Dejong, Chris A. [4]
Skinnider, Michael A. [5]
Cannon, Michael J. [1]
Li, Haoxin [4]
Dial, Keshav [1]
Gunabalasingam, Mathusan [1]
Johnston, Chad [6,7]
Magarvey, Nathan A. [1]
Affiliations
[1] McMaster Univ, Dept Biochem & Biomed Sci, Hamilton, ON L8S 4L8, Canada
[2] McMaster Univ, Dept Med, Hamilton, ON L8S 4L8, Canada
[3] Mansoura Univ, Sch Pharm, Dept Pharmacognosy, Dakahlia 35516, Egypt
[4] Adapsyn Biosci, Hamilton, ON L8P 0A1, Canada
[5] Univ British Columbia, Michael Smith Labs, Vancouver, BC V6T 1Z4, Canada
[6] MIT, Inst Med Engn & Sci, Cambridge, MA 02142, USA
[7] MIT, Dept Biol Engn, Cambridge, MA 02142, USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
natural products; RiPPs; genome mining; machine learning; metabolomics; heterologous expression; chemical structures; mass spectrometry; genome; search; prediction; biosynthesis; antibiotics; metabolism; polyketide
DOI
10.1073/pnas.1901493116
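Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract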
Microbial natural products represent a rich resource of evolved chemistry that forms the basis for the majority of pharmacotherapeutics. Ribosomally synthesized and posttranslationally modified peptides (RiPPs) are a particularly interesting class of natural products noted for their unique mode of biosynthesis and biological activities. Analyses of sequenced microbial genomes have revealed an enormous number of biosynthetic loci encoding RiPPs but whose products remain cryptic. In parallel, analyses of bacterial metabolomes typically assign chemical structures to only a minority of detected metabolites. Aligning these 2 disparate sources of data could provide a comprehensive strategy for natural product discovery. Here we present DeepRiPP, an integrated genomic and metabolomic platform that employs machine learning to automate the selective discovery and isolation of novel RiPPs. DeepRiPP includes 3 modules. The first, NLPPrecursor, identifies RiPPs independent of genomic context and neighboring biosynthetic genes. The second module, BARLEY, prioritizes loci that encode novel compounds, while the third, CLAMS, automates the isolation of their corresponding products from complex bacterial extracts. DeepRiPP pinpoints target metabolites using large-scale comparative metabolomics analysis across a database of 10,498 extracts generated from 463 strains. We apply the DeepRiPP platform to expand the landscape of novel RiPPs encoded within sequenced genomes and to discover 3 novel RiPPs, whose structures are exactly as predicted by our platform. By building on advances in machine learning technologies, DeepRiPP integrates genomic and metabolomic data to guide the isolation of novel RiPPs in an automated manner.
Pages: 371-380
Number of pages: 10