A computational framework to explore large-scale biosynthetic diversity

Times cited: 586
Authors
Navarro-Munoz, Jorge C. [1, 2]
Selem-Mojica, Nelly [3 ]
Mullowney, Michael W. [4 ]
Kautsar, Satria A. [1 ]
Tryon, James H. [4 ]
Parkinson, Elizabeth I. [5, 6, 12]
De Los Santos, Emmanuel L. C. [7 ]
Yeong, Marley [1 ]
Cruz-Morales, Pablo [3 ]
Abubucker, Sahar [8, 13]
Roeters, Arne [1 ]
Lokhorst, Wouter [1 ]
Fernandez-Guerra, Antonio [9, 10, 11]
Cappelini, Luciana Teresa Dias [4 ]
Goering, Anthony W. [4 ]
Thomson, Regan J. [4 ]
Metcalf, William W. [5, 6]
Kelleher, Neil L. [4 ]
Barona-Gomez, Francisco [3 ]
Medema, Marnix H. [1 ]
Affiliations
[1] Wageningen Univ, Bioinformat Grp, Wageningen, Netherlands
[2] Westerdijk Fungal Biodivers Inst, Fungal Nat Prod Grp, Utrecht, Netherlands
[3] Cinvestav IPN, Unidad Genom Avanzada Langebio, Evolut Metab Divers Lab, Irapuato, Mexico
[4] Northwestern Univ, Dept Chem, Evanston, IL 60208 USA
[5] Univ Illinois, Carl R Woese Inst Genom Biol, Urbana, IL USA
[6] Univ Illinois, Dept Microbiol, Urbana, IL USA
[7] Univ Warwick, Warwick Integrat Synthet Biol Ctr, Coventry, W Midlands, England
[8] Novartis Inst BioMed Res, Cambridge, MA USA
[9] Max Planck Inst Marine Microbiol, Microbial Genom & Bioinformat, Bremen, Germany
[10] Univ Copenhagen, Lundbeck Fdn GeoGenet Ctr, GLOBE Inst, Copenhagen, Denmark
[11] Univ Bremen, Ctr Marine Environm Sci, Bremen, Germany
[12] Purdue Univ, Dept Chem, W Lafayette, IN 47907 USA
[13] Sanofi, Cambridge, MA USA
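Funding
EU Horizon 2020; Academy of Finland; UK Biotechnology and Biological Sciences Research Council; São Paulo Research Foundation (Brazil); US National Institutes of Health; UK Engineering and Physical Sciences Research Council;
Keywords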
MULTIPLE SEQUENCE ALIGNMENT; COMPLETE GENOME SEQUENCE; NONRIBOSOMAL PEPTIDE; NATURAL-PRODUCTS; GENE CLUSTERS; MOLECULAR NETWORKING; SECONDARY METABOLISM; DETOXIN COMPLEX; DISCOVERY; EVOLUTION;
DOI
10.1038/s41589-019-0400-9
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Genome mining has become a key technology to exploit natural product diversity. Although initially performed on a single-genome basis, the process is now being scaled up to mine entire genera, strain collections and microbiomes. However, no bioinformatic framework is currently available for effectively analyzing datasets of this size and complexity. In the present study, a streamlined computational workflow is provided, consisting of two new software tools: the 'biosynthetic gene similarity clustering and prospecting engine' (BiG-SCAPE), which facilitates fast and interactive sequence similarity network analysis of biosynthetic gene clusters and gene cluster families; and the 'core analysis of syntenic orthologues to prioritize natural product gene clusters' (CORASON), which elucidates phylogenetic relationships within and across these families. BiG-SCAPE is validated by correlating its output to metabolomic data across 363 actinobacterial strains and the discovery potential of CORASON is demonstrated by comprehensively mapping biosynthetic diversity across a range of detoxin/rimosamide-related gene cluster families, culminating in the characterization of seven detoxin analogues.
Pages: 60 / +
Page count: 13