Parallel BLAST on split databases

Cited by: 40
Author
Mathog, DR [1]
Affiliation
[1] CALTECH, Div Biol, Pasadena, CA 91125 USA
DOI
10.1093/bioinformatics/btg250
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
BLAST programs often run on large SMP machines where multiple threads can work simultaneously and there is enough memory to cache the databases between program runs. A group of programs is described which allows comparable performance to be achieved with a Beowulf configuration in which no node has enough memory to cache a database but the cluster as an aggregate does. To achieve this result, databases are split into equal-sized pieces and stored locally on each node. Each query is run on all nodes in parallel, and the resultant BLAST output files from all nodes are merged to yield the final output.
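The split, search and merge pattern described in the abstract can be illustrated with a minimal Python sketch. This is not the author's software: the function names (`split_database`, `toy_search`, `parallel_search`) are hypothetical, threads stand in for cluster nodes, and a shared k-mer count stands in for a real BLAST run. It only shows that searching balanced fragments in parallel and merging the per-fragment hits gives the same ranked list as searching the whole database at once.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq


def split_database(records, n_pieces):
    """Greedily assign (name, sequence) records to n_pieces fragments
    so that each fragment holds roughly the same number of residues."""
    fragments = [[] for _ in range(n_pieces)]
    # Min-heap of (residues assigned so far, fragment index).
    heap = [(0, i) for i in range(n_pieces)]
    # Placing the longest sequences first keeps the fragments balanced.
    for name, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        size, idx = heapq.heappop(heap)
        fragments[idx].append((name, seq))
        heapq.heappush(heap, (size + len(seq), idx))
    return fragments


def toy_search(query, fragment, k=3):
    """Assumed stand-in for one node's BLAST run:
    the score is the number of distinct k-mers shared with the query."""
    query_kmers = {query[i:i + k] for i in range(len(query) - k + 1)}
    hits = []
    for name, seq in fragment:
        seq_kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        score = len(query_kmers & seq_kmers)
        if score > 0:
            hits.append((score, name))
    return hits


def parallel_search(query, fragments):
    """Search every fragment concurrently, then merge all hits
    into one list ordered by descending score (ties by name)."""
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        per_node = list(pool.map(lambda frag: toy_search(query, frag), fragments))
    merged = [hit for hits in per_node for hit in hits]
    return sorted(merged, key=lambda h: (-h[0], h[1]))


if __name__ == "__main__":
    records = [("a", "ACGTACGT"), ("b", "TTTTTTTT"),
               ("c", "ACGTTTTT"), ("d", "GGGGACGT")]
    fragments = split_database(records, n_pieces=2)
    print(parallel_search("ACGTAC", fragments))
```

The toy score does not depend on database size, so merging is a plain sort. Real BLAST statistics such as E-values do depend on the size of the database searched, so merging real per-fragment output would also have to account for the size of the full database.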
Pages: 1865-1866
Page count: 2