Personalized cloud-based bioinformatics services for research and education: use cases and the elasticHPC package

Cited by: 14
Authors
El-Kalioby, Mohamed [1]
Abouelhoda, Mohamed [1, 2]
Krueger, Jan [3 ]
Giegerich, Robert [3 ]
Sczyrba, Alexander [3 ]
Wall, Dennis P. [4 ]
Tonellato, Peter [4 ]
Affiliations
[1] Nile Univ, Ctr Informat Sci, Giza, Egypt
[2] Cairo Univ, Fac Engn, Giza 12211, Egypt
[3] Univ Bielefeld, Fac Technol, D-33615 Bielefeld, Germany
[4] Harvard Univ, Ctr Biomed Informat, Sch Med, Cambridge, MA 02138 USA
Keywords
GALAXY; ALIGNMENT;
DOI
10.1186/1471-2105-13-S17-S22
CLC classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: Bioinformatics services have traditionally been provided as a web server that is hosted on institutional infrastructure and serves multiple users. This model, however, is not flexible enough to cope with the increasing number of users, increasing data size, and new requirements in terms of speed and availability of service. The advent of cloud computing suggests a new service model that provides an efficient solution to these problems, based on the concepts of "resources-on-demand" and "pay-as-you-go". However, cloud computing has not yet been introduced within bioinformatics servers due to the lack of usage scenarios and software layers that address the requirements of the bioinformatics domain.
Results: In this paper, we provide different use case scenarios for providing cloud-computing-based services, considering both the technical and financial aspects of the cloud computing service model. These scenarios are for individual users seeking computational power as well as for bioinformatics service providers aiming to provide personalized bioinformatics services to their users. We also present elasticHPC, a software package and a library that facilitates the use of high performance cloud computing resources in general and the implementation of the suggested bioinformatics scenarios in particular. Concrete examples that demonstrate the suggested use case scenarios with whole bioinformatics servers and major sequence analysis tools like BLAST are presented. Experimental results with large datasets are also included to show the advantages of the cloud model.
Conclusions: Our use case scenarios and the elasticHPC package are steps towards the provision of cloud-based bioinformatics services, which would help in overcoming the data challenge of recent biological research. All resources related to elasticHPC and its web interface are available at http://www.elasticHPC.org.
Pages: 17