Challenges and opportunities in sharing microbiome data and analyses

Cited by: 15
Authors
Huttenhower, Curtis [1 ,2 ,3 ,4 ]
Finn, Robert D. [5 ]
McHardy, Alice Carolyn [6 ,7 ]
Affiliations
[1] Harvard TH Chan Sch Publ Hlth, Harvard Chan Microbiome Publ Hlth Ctr, Boston, MA 02115 USA
[2] Harvard TH Chan Sch Publ Hlth, Dept Biostat, Boston, MA 02115 USA
[3] Harvard TH Chan Sch Publ Hlth, Dept Immunol & Infect Dis, Boston, MA 02115 USA
[4] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[5] European Bioinformat Inst, European Mol Biol Lab, Wellcome Genome Campus, Cambridge, England
[6] Helmholtz Ctr Infect Res, Computat Biol Infect Res, Braunschweig, Germany
[7] Tech Univ Carolo Wilhelmina Braunschweig, Braunschweig Integrated Ctr Syst Biol BRICS, Braunschweig, Germany
Keywords
MINIMUM INFORMATION; WIDE ASSOCIATION; DATA-MANAGEMENT; QUALITY-CONTROL; STANDARDS; RESOURCE; TOOLS; DATABASES; PLATFORM; CONDUCT
DOI
10.1038/s41564-023-01484-x
Chinese Library Classification number
Q93 [Microbiology]
Subject classification codes
071005; 100705
Abstract
Microbiome data, metadata and analytical workflows have become 'big' in terms of volume and complexity. Although the infrastructure and technologies to share data have been established, the interdisciplinary and multi-omic nature of the field can make resources difficult to identify and use. Following best practices for data deposition requires substantial effort, with sometimes little obvious reward. Gaps remain where microbiome-specific resources for data sharing or reproducibility do not yet exist. We outline available best practices, challenges to their adoption and opportunities in data sharing in microbiome research. We showcase examples of best practices and advocate for their enforcement and incentivization for data sharing. This includes recognition of data curation and sharing endeavours by individuals, institutions, journals and funders. Opportunities for progress include enabling microbiome-specific databases to incorporate future methods for data analysis, integration and reuse. The article outlines best practices, resources and suggested improvements to ensure that the complexities inherent to microbial big data do not hamper accessibility.
Pages: 1960-1970
Number of pages: 11