BGBlast: A BLAST Grid Implementation with Database Self-Updating and Adaptive Replication

Cited by: 0
Authors
Trombetti, Gabriele A. [1 ]
Merelli, Ivan [1 ]
Orro, Alessandro [1 ]
Milanesi, Luciano [1 ]
Affiliations
[1] CNR, Inst Biomed Technol, I-20090 Segrate, MI, Italy
Source
FROM GENES TO PERSONALIZED HEALTHCARE: GRID SOLUTIONS FOR THE LIFE SCIENCES | 2007 / Vol. 126
Keywords
Bioinformatics; adaptive database replication
DOI
Not available
Chinese Library Classification number
R19 [Health care organizations and services (health services management)]
Abstract
BLAST is probably the most widely used application among bioinformatics teams. Its computational cost becomes a concern when the query sequence sets and reference databases are large. Here we present BGBlast, an approach for handling the computational complexity of large BLAST executions by porting BLAST to the Grid platform, leveraging the power of the thousands of CPUs that compose the EGEE infrastructure. BGBlast provides innovative features for efficiently managing BLAST databases in the distributed Grid environment. The system (1) keeps the databases constantly up to date while still allowing the user to revert to earlier versions, (2) stores the older versions of databases on the Grid with a time- and space-efficient delta encoding, and (3) manages the number of replicas of each database over the Grid with an adaptive algorithm, dynamically balancing execution parallelism against storage costs.
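The abstract does not say how the adaptive replication algorithm decides on a replica count. As a purely illustrative sketch, assume a hypothetical policy in which the target number of replicas for a database grows with its recent job demand and is clamped between a minimum and a maximum. Every name and default below (`target_replicas`, `replica_adjustment`, `jobs_per_replica`, the bounds) is an assumption made for illustration and is not taken from BGBlast.

```python
import math


def target_replicas(recent_jobs, jobs_per_replica=50, min_replicas=1, max_replicas=20):
    """Return a hypothetical target replica count for one database.

    Assumption (not from the paper): one replica is wanted per
    `jobs_per_replica` recent jobs, clamped to [min_replicas, max_replicas].
    More replicas allow more parallel jobs; the upper bound caps storage cost.
    """
    if recent_jobs < 0:
        raise ValueError("recent_jobs must be non-negative")
    wanted = math.ceil(recent_jobs / jobs_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


def replica_adjustment(current_replicas, recent_jobs, **policy):
    """Positive result: replicas to add. Negative result: replicas to delete."""
    return target_replicas(recent_jobs, **policy) - current_replicas
```

Under this assumed policy, a heavily queried database gains replicas until it reaches the cap, while a rarely used one shrinks back toward a single copy, which is one simple way to trade parallelism against storage.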
Pages: 23-30
Page count: 8