Data Lakes, Clouds, and Commons: A Review of Platforms for Analyzing and Sharing Genomic Data

Cited by: 37
Author
Grossman, Robert L. [1]
Affiliation
[1] Univ Chicago, Ctr Translat Data Sci, 900 East 57th St, KCBD 10142, Chicago, IL 60637 USA
Keywords
CANCER; VISION; BROWSER; GALAXY
DOI
10.1016/j.tig.2018.12.006
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Data commons collate data with cloud computing infrastructure and commonly used software services, tools, and applications to create biomedical resources for the large-scale management, analysis, harmonization, and sharing of biomedical data. Over the past few years, data commons have been used to analyze, harmonize, and share large-scale genomics datasets. Data ecosystems can be built by interoperating multiple data commons. It can be quite labor intensive to curate, import, and analyze the data in a data commons. Data lakes provide an alternative to data commons and simply provide access to data, with the data curation and analysis deferred until later and delegated to those that access the data. We review software platforms for managing, analyzing, and sharing genomic data, with an emphasis on data commons, but also cover data ecosystems and data lakes.
Pages: 223-234
Page count: 12