D-ASSIRC: distributed program for finding sequence similarities in genomes

Cited by: 3
Authors
Vincens, P
Badel-Chagnon, A
André, C
Hazout, S
Affiliations
[1] Ecole Normale Super, Dept Biol FR 36, F-75230 Paris 05, France
[2] Univ Paris 07, INSERM, U436, Equipe Bioinformat Genom & Mol, F-75251 Paris, France
[3] CHU Pitie Salpetriere, Dept Biomath, F-75634 Paris 13, France
DOI
10.1093/bioinformatics/18.3.446
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Motivation: Locating regions of similarity in a genome requires appropriate tools such as 'Accelerated Search for SImilar Regions in Chromosomes' (ASSIRC; Vincens et al., Bioinformatics, 14, 715-725, 1998). The aim of this paper is to present different strategies for improving this program by distributing the operations and data to multiple processing units, and to assess the efficiency of the different implementations in terms of running time as a function of the number of processing units.
Results: The new version, D-ASSIRC, is based on three alternative strategies of task sharing: (1) a distributed search that splits the studied sequences into large overlapping subsequences (strategy ASS); (2) two distributed searches for repeated exact motifs of fixed size, either managed by a central processor (strategy AGD) or managed locally by numerous processors (strategy ALD). Strategy ASS is suitable for a large number of processing units (the running time was divided by a factor of 12 when the number of processing units was increased from 1 to 16), whereas strategy ALD is better for a small set of processors (typically four or six). The proposed strategies are efficient for various applications in genomic research, particularly for locating similarities between nucleic acid sequences in large genomes.
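The splitting step behind strategy ASS, and the reported scaling figure, can be illustrated with a minimal Python sketch. This is not the D-ASSIRC implementation: the function names `split_overlapping` and `parallel_efficiency`, the chunking rule, and the `overlap` parameter are assumptions made for illustration only, since the abstract does not specify how the overlapping subsequences are sized.

```python
def split_overlapping(seq, n_chunks, overlap):
    """Split seq into about n_chunks pieces (hypothetical helper).

    Each piece except the last extends `overlap` characters into the
    next one, so a similarity spanning a cut point is still fully
    contained in at least one piece (if it is no longer than overlap + 1).
    Returns a list of (start_offset, subsequence) pairs.
    """
    if n_chunks < 1 or overlap < 0:
        raise ValueError("n_chunks must be >= 1 and overlap >= 0")
    if not seq:
        return []
    size = -(-len(seq) // n_chunks)  # ceiling division
    chunks = []
    for start in range(0, len(seq), size):
        end = min(start + size + overlap, len(seq))
        chunks.append((start, seq[start:end]))
    return chunks


def parallel_efficiency(speedup, n_units):
    """Speedup divided by the number of processing units."""
    return speedup / n_units


if __name__ == "__main__":
    # Toy sequence; real inputs would be whole chromosomes.
    for start, piece in split_overlapping("ACGTACGTAC", 2, 2):
        print(start, piece)
    # Figure quoted in the abstract: time divided by 12 on 16 units.
    print(parallel_efficiency(12, 16))
```

Each `(start_offset, subsequence)` pair could then be handed to a separate worker, with the offset used to map hits back to coordinates in the full sequence. Under this reading, the factor of 12 on 16 processing units corresponds to a parallel efficiency of 0.75.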
Pages: 446-451
Page count: 6