A novel parallelization approach for hierarchical clustering

Cited by: 28
Authors
Du, Z [1]
Lin, F [1]
Affiliations
[1] Nanyang Technol Univ, BioInformat Res Ctr, Singapore 639798, Singapore
Keywords
clustering; parallelization; gene expression
DOI
10.1016/j.parco.2005.01.001
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Identification of groups of genes that manifest similar expression patterns is a key step in the analysis of gene expression data. Hierarchical clustering was developed for that purpose. A fundamental problem with previous implementations of this clustering method is their inability to handle large data sets within reasonable time and memory limits. In this paper, we present a parallel approach for solving this problem. Implementation of the parallel algorithm is illustrated on data from high-dimensional microarray experiments related to gene expression in cancerous disease and Arabidopsis seedling growth. The results show a considerable reduction in computational time and inter-node communication overhead, especially for large data sets. (c) 2005 Elsevier B.V. All rights reserved.
Pages: 523-527
Page count: 5