Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space

Cited by: 75
Authors
Schatz, Michael C. [1, 2]
Philippakis, Anthony A. [3]
Afgan, Enis [1]
Banks, Eric [3]
Carey, Vincent J. [4]
Carroll, Robert J. [5]
Culotti, Alessandro [3, 6]
Ellrott, Kyle [7]
Goecks, Jeremy [7]
Grossman, Robert L. [6]
Hall, Ira M. [8]
Hansen, Kasper D. [9]
Lawson, Jonathan [3]
Leek, Jeffrey T. [9]
Luria, Anne O'Donnell [3]
Mosher, Stephen [1]
Morgan, Martin [10]
Nekrutenko, Anton [11]
O'Connor, Brian D. [3]
Osborn, Kevin [12]
Paten, Benedict [12]
Patterson, Candace [3]
Tan, Frederick J. [13]
Taylor, Casey Overby [14]
Vessio, Jennifer [1]
Waldron, Levi [15]
Wang, Ting [16]
Wuichet, Kristin [5]
Affiliations
[1] Johns Hopkins Univ, Dept Biol, Baltimore, MD 21218 USA
[2] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218 USA
[3] Broad Inst MIT & Harvard, Cambridge, MA 02142 USA
[4] Harvard Univ, Harvard Med Sch, Cambridge, MA USA
[5] Vanderbilt Univ, Med Ctr, Dept Biomed Informat, Nashville, TN USA
[6] Univ Chicago, Ctr Translat Data Sci, Chicago, IL USA
[7] Oregon Hlth & Sci Univ, Biomed Engn, Portland, OR USA
[8] Yale Univ, Yale Sch Med, New Haven, CT USA
[9] Johns Hopkins Univ, Dept Biostat, Baltimore, MD USA
[10] Roswell Pk Comprehens Canc Ctr, Dept Biostat & Bioinformat, Buffalo, NY USA
[11] Penn State Univ, Dept Biochem & Mol Biol, State Coll, PA USA
[12] UC Santa Cruz Genom Inst, UC Santa Cruz, Santa Cruz, CA USA
[13] Carnegie Inst, Dept Embryol, Baltimore, MD USA
[14] Johns Hopkins Univ, Dept Med, Baltimore, MD USA
[15] City Univ New York, Grad Sch Publ Hlth & Hlth Policy, Dept Epidemiol & Biostat, New York, NY USA
[16] Washington Univ St Louis, Dept Genet, St Louis, MO USA
Source
CELL GENOMICS | 2022 / Volume 2 / Issue 01
Keywords
TRANSCRIPTION-FACTOR; CHROMATIN ACCESSIBILITY; EMBRYONIC-DEVELOPMENT; REVEALS PRINCIPLES; BINDING PROTEINS; CELL; GENE; EXPRESSION; SEQUENCE; DIFFERENTIATION;
DOI
10.1016/j.xgen.2021.100085
Chinese Library Classification number
Q2 [Cell Biology];
Discipline classification code
071009 ; 090102 ;
Abstract
The NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL; https://anvilproject.org) was developed to address a widespread community need for a unified computing environment for genomics data storage, management, and analysis. In this perspective, we present AnVIL, describe its ecosystem and interoperability with other platforms, and highlight how this platform and associated initiatives contribute to improved genomic data sharing efforts. The AnVIL is a federated cloud platform designed to manage and store genomics and related data, enable population-scale analysis, and facilitate collaboration through the sharing of data, code, and analysis results. By inverting the traditional model of data sharing, the AnVIL eliminates the need for data movement while also adding security measures for active threat detection and monitoring and provides scalable, shared computing resources for any researcher. We describe the core data management and analysis components of the AnVIL, which currently consists of Terra, Gen3, Galaxy, RStudio/Bioconductor, Dockstore, and Jupyter, and describe several flagship genomics datasets available within the AnVIL. We continue to extend and innovate the AnVIL ecosystem by implementing new capabilities, including mechanisms for interoperability and responsible data sharing, while streamlining access management. The AnVIL opens many new opportunities for analysis, collaboration, and data sharing that are needed to drive research and to make discoveries through the joint analysis of hundreds of thousands to millions of genomes along with associated clinical and molecular data types.
Pages: 13