subMG automates data submission for metagenomics studies

Cited by: 0
Authors
Tubbesing, Tom [1]
Schlueter, Andreas [1]
Sczyrba, Alexander [1, 2]
Affiliations
[1] Bielefeld Univ, Ctr Biotechnol CeBiTec, Computat Metagen Grp, Univ Str 27, D-33615 Bielefeld, Germany
[2] Forschungszentrum Julich, Inst Bio & Geosci IBG, IBG 5 Computat Metagen, Ctr Biotechnol CeBiTec, D-33594 Bielefeld, Germany
Keywords
Metagenomics; European Nucleotide Archive; Submission; FAIR; Metadata
DOI
10.1186/s13040-025-00453-w
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Background: Publicly available metagenomics datasets are crucial for ensuring the reproducibility of scientific findings and supporting contemporary large-scale studies. However, submitting a comprehensive metagenomics dataset is both cumbersome and time-consuming. It requires including sample information, sequencing reads, assemblies, binned contigs, metagenome-assembled genomes (MAGs), and appropriate metadata. As a result, metagenomics studies are often published with incomplete datasets or, in some cases, without any data at all. subMG addresses this challenge by simplifying and automating the data submission process, thereby encouraging broader and more consistent data sharing.
Results: subMG streamlines the process of submitting metagenomics study results to the European Nucleotide Archive (ENA) by allowing researchers to input files and metadata from their studies in a single form and automating downstream tasks that otherwise require extensive manual effort and expertise. The tool comes with comprehensive documentation as well as example data tailored for different use cases and can be operated via the command line or a graphical user interface (GUI), making it easily deployable to a wide range of potential users.
Conclusions: By simplifying the submission of genome-resolved metagenomics study datasets, subMG significantly reduces the time, effort, and expertise required from researchers, thus paving the way for more numerous and comprehensive data submissions in the future. An increased availability of well-documented and FAIR data can benefit future research, particularly in meta-analyses and comparative studies.
Pages: 7