MIBiG 3.0: a community-driven effort to annotate experimentally validated biosynthetic gene clusters

Cited by: 196
Authors
Terlouw, Barbara R. [1 ]
Blin, Kai [2 ]
Navarro-Munoz, Jorge C. [1 ,3 ]
Avalon, Nicole E. [4 ]
Chevrette, Marc G. [5 ]
Egbert, Susan [6 ]
Lee, Sanghoon [7 ]
Meijer, David [1 ]
Recchia, Michael J. J. [7 ]
Reitz, Zachary L. [1 ]
van Santen, Jeffrey A. [7 ,8 ]
Selem-Mojica, Nelly [9 ]
Torring, Thomas [10 ]
Zaroubi, Liana [7 ]
Alanjary, Mohammad [1 ]
Aleti, Gajender [11 ]
Aguilar, Cesar [12 ]
Al-Salihi, Suhad A. A. [13 ]
Augustijn, Hannah E. [1 ,14 ]
Avelar-Rivas, J. Abraham [15 ]
Avitia-Dominguez, Luis A. [14 ,15 ]
Barona-Gomez, Francisco [14 ,15 ]
Bernaldo-Aguero, Jordan [16 ]
Bielinski, Vincent A. [17 ]
Biermann, Friederike [1 ,18 ,19 ]
Booth, Thomas J. [2 ,20 ]
Bravo, Victor J. Carrion [14 ,21 ,22 ]
Castelo-Branco, Raquel [23 ,24 ]
Chagas, Fernanda O. [25 ]
Cruz-Morales, Pablo [2 ]
Du, Chao [14 ]
Duncan, Katherine R. [26 ]
Gavriilidou, Athina [27 ,28 ]
Gayrard, Damien [29 ]
Gutierrez-Garcia, Karina [30 ]
Haslinger, Kristina [31 ]
Helfrich, Eric J. N. [18 ,19 ]
van der Hooft, Justin J. J. [1 ,32 ]
Jati, Afif P. [33 ]
Kalkreuter, Edward [34 ]
Kalyvas, Nikolaos [3 ]
Kang, Kyo B. [35 ]
Kautsar, Satria [34 ]
Kim, Wonyong [36 ]
Kunjapur, Aditya M. [37 ]
Li, Yong-Xin [38 ]
Lin, Geng-Min [39 ]
Loureiro, Catarina [40 ]
Louwen, Joris J. R. [1 ]
Louwen, Nico L. L. [1 ]
Affiliations
[1] Wageningen Univ, Bioinformat Grp, NL-6708 PB Wageningen, Netherlands
[2] Tech Univ Denmark, Novo Nordisk Fdn Ctr Biosustainabil, Lyngby, Denmark
[3] Westerdijk Fungal Biodivers Inst, Uppsalalaan 8, NL-3584 CT Utrecht, Netherlands
[4] Univ Calif San Diego, Scripps Inst Oceanog, 9500 Gilman Dr, La Jolla, CA 92093 USA
[5] Univ Florida, Dept Microbiol & Cell Sci, Gainesville, FL 32611 USA
[6] Univ Manitoba, Dept Chem, 66 Chancellors Cir, Winnipeg, MB R3T 2N2, Canada
[7] Simon Fraser Univ, Dept Chem, 8888 Univ Dr, Burnaby, BC V5A 1S6, Canada
[8] Unnat Prod, 2161 Delaware Ave,Suite A, Santa Cruz, CA 95060 USA
[9] Ctr Ciencias Matemat UNAM, Morelia, Michoacan, Mexico
[10] Aarhus Univ, Dept Biol & Chem Engn, Aarhus, Denmark
[11] Tennessee State Univ, Dept Agr & Environm Sci, Food & Anim Sci, Nashville, TN 37209 USA
[12] Purdue Univ, Dept Chem, W Lafayette, IN 47907 USA
[13] Univ Technol Baghdad, Dept Appl Sci, Baghdad, Iraq
[14] Leiden Univ, Inst Biol, Sylviusweg 72, NL-2333 BE Leiden, Netherlands
[15] Lab Nacl Genom Biodiversidad Unidad Genom Avanzad, Km 9-6 Libramiento Norte Carretera Irapuato Leon, Irapuato 36824, Gto, Mexico
[16] Univ Nacl Autonoma Mexico, Inst Biotecnol, Dept Microbiol Mol, Cuernavaca, Morelos, Mexico
[17] J Craig Venter Inst, Synthet Biol & Bioenergy Grp, La Jolla, CA 92037 USA
[18] Goethe Univ Frankfurt, Inst Mol Bio Sci, D-60438 Frankfurt, Germany
[19] LOEWE Ctr Translat Biodivers Genom TBG, Senckenberganlage 25, D-60325 Frankfurt, Germany
[20] Univ Western Australia, Sch Mol Sci, Perth, WA, Australia
[21] Univ Malaga, Univ Malaga Consejo Super Invest Cient IHSM CSIC, Inst Hortofruticultura Subtrop & Mediterranea La, Dept Microbiol, Malaga, Spain
[22] Netherlands Inst Ecol NIOO KNAW, Dept Microbial Ecol, Wageningen, Netherlands
[23] Univ Porto, Interdisciplinary Ctr Marine & Environm Res CIIMA, Porto, Portugal
[24] Univ Porto, Fac Sci, P-4150179 Porto, Portugal
[25] Univ Federal do Rio de Janeiro, Inst Pesquisas Prod Nat Walter Mors, BR-21941599 Rio De Janeiro, RJ, Brazil
[26] Univ Strathclyde, Strathclyde Inst Pharm & Biomed Sci, 141 Cathedral St, Glasgow G4 0RE, Lanark, Scotland
[27] Univ Tubingen, Interfac Inst Microbiol & Infect Med Tubingen IMI, Translat Genome Min Nat Prod, Tubingen, Germany
[28] Univ Tubingen, Interfac Inst Biomed Informat IBMI, Tubingen, Germany
[29] John Innes Ctr, Dept Mol Microbiol, Norwich Res Pk, Norwich NR4 7UH, Norfolk, England
[30] Carnegie Inst Sci, Dept Embryol, 3520 San Martin Dr, Baltimore, MD 21218 USA
[31] Univ Groningen, Groningen Res Inst Pharm, Dept Chem & Pharmaceut Biol, Antonius Deusinglaan 1, NL-9713 AV Groningen, Netherlands
[32] Univ Johannesburg, Dept Biochem, Auckland Pk, ZA-2006 Johannesburg, South Africa
[33] Indonesian Soc Bioinformat & Biodivers, Jakarta, Indonesia
[34] Univ Florida, Dept Chem, Scripps Biomed Res, 110 Scripps Way, Jupiter, FL 33458 USA
[35] Sookmyung Womens Univ, Coll Pharm, Seoul, South Korea
[36] Sunchon Natl Univ, Korean Lichen Res Inst, Sunchon, South Korea
[37] Univ Delaware, Dept Chem & Biomol Engn, Newark, DE 19716 USA
[38] Univ Hong Kong, Dept Chem, Pokfulam Rd, Hong Kong, Peoples R China
[39] MIT, Dept Biol Engn, Cambridge, MA USA
[40] Wageningen Univ, Lab Microbiol, Stippeneng 4, NL-6708 WE Wageningen, Netherlands
[41] Rothamsted Res, Sustainable Soils & Crops, Harpenden, Herts, England
[42] Univ Costa Rica, Fac Farm, Inst Invest Farmaceut INIFAR, San Jose 115012060, Costa Rica
[43] Univ Costa Rica, Ctr Invest Prod Nat CIPRONA, San Jose 115012060, Costa Rica
[44] CeNAT CONARE, Ctr Nacl Innovac Biotecnol CENIBiot, San Jose 11741200, Costa Rica
[45] Oregon State Univ, Dept Pharmaceut Sci, Corvallis, OR 97331 USA
[46] Univ Porto, Inst Biomed Sci Abel Salazar ICBAS, Porto, Portugal
[47] Yenepoya Deemed Univ, Ctr Integrat Omics Data Sci, Mangalore 575018, India
[48] Eawag Swiss Fed Inst Aquat Sci & Technol, Dept Environm Microbiol, Uberlandstr 133, CH-8600 Dubendorf, Switzerland
[49] Univ Nottingham, Sch Chem, Univ Pk, Nottingham NG7 2RD, England
[50] Inst Chem Biol, Shenzhen Bay Lab, Shenzhen 518132, Peoples R China
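Funding
EU Horizon 2020; US National Institutes of Health; UK Biotechnology and Biological Sciences Research Council; US National Science Foundation; National Research Foundation Singapore; European Research Council;
Keywords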
INFORMATION; DIVERSITY;
DOI
10.1093/nar/gkac1049
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
With an ever-increasing amount of (meta)genomic data being deposited in sequence databases, (meta)genome mining for natural product biosynthetic pathways occupies a critical role in the discovery of novel pharmaceutical drugs, crop protection agents and biomaterials. The genes that encode these pathways are often organised into biosynthetic gene clusters (BGCs). In 2015, we defined the Minimum Information about a Biosynthetic Gene cluster (MIBiG): a standardised data format that describes the minimally required information to uniquely characterise a BGC. We simultaneously constructed an accompanying online database of BGCs, which has since been widely used by the community as a reference dataset for BGCs and was expanded to 2021 entries in 2019 (MIBiG 2.0). Here, we describe MIBiG 3.0, a database update comprising large-scale validation and re-annotation of existing entries and 661 new entries. Particular attention was paid to the annotation of compound structures and biological activities, as well as protein domain selectivities. Together, these new features keep the database up-to-date, and will provide new opportunities for the scientific community to use its freely available data, e.g. for the training of new machine learning models to predict sequence-structure-function relationships for diverse natural products. MIBiG 3.0 is accessible online at https://mibig.secondarymetabolites.org/.
Pages: D603-D610
Page count: 8
Related papers
25 in total
  • [1] RiPPMiner: a bioinformatics resource for deciphering chemical structures of RiPPs based on prediction of cleavage and cross-links
    Agrawal, Priyesh
    Khater, Shradha
    Gupta, Money
    Sain, Neetu
    Mohanty, Debasisa
    [J]. NUCLEIC ACIDS RESEARCH, 2017, 45 (W1) : W80 - W88
  • [2] antiSMASH 6.0: improving cluster detection and comparison capabilities
    Blin, Kai
    Shaw, Simon
    Kloosterman, Alexander M.
    Charlop-Powers, Zach
    van Wezel, Gilles P.
    Medema, Marnix H.
    Weber, Tilmann
    [J]. NUCLEIC ACIDS RESEARCH, 2021, 49 (W1) : W29 - W35
  • [3] Carroll LM, 2000, BIOINFORM, DOI 10.1101/2021.05.03.442509
  • [4] SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria
    Chevrette, Marc G.
    Aicheler, Fabian
    Kohlbacher, Oliver
    Currie, Cameron R.
    Medema, Marnix H.
    [J]. BIOINFORMATICS, 2017, 33 (20) : 3202 - 3210
  • [5] The ChEMBL database in 2017
    Gaulton, Anna
    Hersey, Anne
    Nowotka, Michal
    Bento, A. Patricia
    Chambers, Jon
    Mendez, David
    Mutowo, Prudence
    Atkinson, Francis
    Bellis, Louisa J.
    Cibrian-Uhalte, Elena
    Davies, Mark
    Dedman, Nathan
    Karlsson, Anneli
    Magarinos, Maria Paula
    Overington, John P.
    Papadatos, George
    Smit, Ines
    Leach, Andrew R.
    [J]. NUCLEIC ACIDS RESEARCH, 2017, 45 (D1) : D945 - D954
  • [6] Compendium of specialized metabolite biosynthetic diversity encoded in bacterial genomes
    Gavriilidou, Athina
    Kautsar, Satria A.
    Zaburannyi, Nestor
    Krug, Daniel
    Mueller, Rolf
    Medema, Marnix H.
    Ziemert, Nadine
    [J]. NATURE MICROBIOLOGY, 2022, 7 (05) : 726 - +
  • [7] A deep learning genome-mining strategy for biosynthetic gene cluster prediction
    Hannigan, Geoffrey D.
    Prihoda, David
    Palicka, Andrej
    Soukup, Jindrich
    Klempir, Ondrej
    Rampula, Lena
    Durcak, Jindrich
    Wurst, Michael
    Kotowski, Jakub
    Chang, Dan
    Wang, Rurun
    Piizzi, Grazia
    Temesi, Gergely
    Hazuda, Daria J.
    Woelk, Christopher H.
    Bitton, Danny A.
    [J]. NUCLEIC ACIDS RESEARCH, 2019, 47 (18)
  • [8] Structures of a non-ribosomal peptide synthetase condensation domain suggest the basis of substrate selectivity
    Izore, Thierry
    Ho, Y. T. Candace
    Kaczmarski, Joe A.
    Gavriilidou, Athina
    Chow, Ka Ho
    Steer, David L.
    Goode, Robert J. A.
    Schittenhelm, Ralf B.
    Tailhades, Julien
    Tosin, Manuela
    Challis, Gregory L.
    Krenske, Elizabeth H.
    Ziemert, Nadine
    Jackson, Colin J.
    Cryle, Max J.
    [J]. NATURE COMMUNICATIONS, 2021, 12 (01)
  • [9] MIBiG 2.0: a repository for biosynthetic gene clusters of known function
    Kautsar, Satria A.
    Blin, Kai
    Shaw, Simon
    Navarro-Munoz, Jorge C.
    Terlouw, Barbara R.
    van der Hooft, Justin J. J.
    van Santen, Jeffrey A.
    Tracanna, Vittorio
    Duran, Hernando G. Suarez
    Andreu, Victoria Pascal
    Selem-Mojica, Nelly
    Alanjary, Mohammad
    Robinson, Serina L.
    Lund, George
    Epstein, Samuel C.
    Sisto, Ashley C.
    Charkoudian, Louise
    Collemare, Jerome
    Linington, Roger G.
    Weber, Tilmann
    Medema, Marnix H.
    [J]. NUCLEIC ACIDS RESEARCH, 2020, 48 (D1) : D454 - D458
  • [10] plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters
    Kautsar, Satria A.
    Duran, Hernando G. Suarez
    Blin, Kai
    Osbourn, Anne
    Medema, Marnix H.
    [J]. NUCLEIC ACIDS RESEARCH, 2017, 45 (W1) : W55 - W63