High-performance integrated virtual environment (HIVE): a robust infrastructure for next-generation sequence data analysis

Cited by: 54
Authors
Simonyan, Vahan [1]
Chumakov, Konstantin [1]
Dingerdissen, Hayley [1, 2]
Faison, William [2]
Goldweber, Scott [1, 2]
Golikov, Anton [1]
Gulzar, Naila [2]
Karagiannis, Konstantinos [1, 2]
Lam, Phuc Vinh Nguyen [1]
Maudru, Thomas [1]
Muravitskaja, Olesja [1]
Osipova, Ekaterina [1]
Pan, Yang [2]
Pschenichnov, Alexey [1]
Rostovtsev, Alexandre [1]
Santana-Quintero, Luis [1]
Smith, Krista [1, 2]
Thompson, Elaine E. [1]
Tkachenko, Valery [1]
Torcivia-Rodriguez, John [1, 2]
Voskanian, Alin [1]
Wan, Quan [2]
Wang, Jing [1]
Wu, Tsung-Jung [2]
Wilson, Carolyn [1]
Mazumder, Raja [2]
Affiliations
[1] US FDA, Ctr Biol Evaluat & Res, Silver Spring, MD 20993 USA
[2] George Washington Univ, Dept Biochem & Mol Biol, Med Ctr, Washington, DC 20037 USA
Source
DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION | 2016
Keywords
ALIGNMENT; POLIOMYELITIS; BIOLOGY
DOI
10.1093/database/baw022
Chinese Library Classification
Q [Biological sciences]
Discipline classification code
07; 0710; 09
Abstract
The High-performance Integrated Virtual Environment (HIVE) is a distributed storage and compute environment designed primarily to handle next-generation sequencing (NGS) data. This multicomponent cloud infrastructure provides secure web access for authorized users to deposit, retrieve, annotate and compute on NGS data, and to analyse the outcomes using web interface visual environments appropriately built in collaboration with research and regulatory scientists and other end users. Unlike many massively parallel computing environments, HIVE uses a cloud control server which virtualizes services, not processes. It is both very robust and flexible due to the abstraction layer introduced between computational requests and operating system processes. The novel paradigm of moving computations to the data, instead of moving data to computational nodes, has proven to be significantly less taxing for both hardware and network infrastructure. The honeycomb data model developed for HIVE integrates metadata into an object-oriented model. Its distinction from other object-oriented databases is in the additional implementation of a unified application program interface to search, view and manipulate data of all types. This model simplifies the introduction of new data types, thereby minimizing the need for database restructuring and streamlining the development of new integrated information systems. The honeycomb model employs a highly secure hierarchical access control and permission system, allowing determination of data access privileges in a finely granular manner without flooding the security subsystem with a multiplicity of rules. HIVE infrastructure will allow engineers and scientists to perform NGS analysis in a manner that is both efficient and secure. HIVE is actively supported in public and private domains, and project collaborations are welcomed.
Pages: 16