An optimal hierarchical clustering algorithm for gene expression data

被引:12
作者
Seal, S [1 ]
Komarina, S [1 ]
Aluru, S [1 ]
机构
[1] Iowa State Univ Sci & Technol, Dept Elect & Comp Engn, Ames, IA 50011 USA
关键词
algorithms; computational geometry; gene expression; hierarchical clustering; microarrays;
D O I
10.1016/j.ipl.2004.11.001
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Microarrays are used for measuring expression levels of thousands of genes simultaneously. Clustering algorithms are used on gene expression data to find co-regulated genes. An often used clustering strategy is the Pearson correlation coefficient based hierarchical clustering algorithm presented in [Proc. Nat. Acad. Sci. 95 (25) (1998) 14863-14868], which takes O(N-3) time. We note that this run time can be reduced to O(N-2) by applying known hierarchical clustering algorithms [Proc. 9th Annual ACM-SIAM Symposium on Discrete Algorithms, 1998, pp. 619-628] to this problem. In this paper. we present an algorithm which runs in O(N log N) time using a geometrical reduction and show that it is optimal. (C) 2004 Elsevier B.V. All rights reserved.
引用
收藏
页码:143 / 147
页数:5
相关论文
共 7 条
[1]   Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling [J].
Alizadeh, AA ;
Eisen, MB ;
Davis, RE ;
Ma, C ;
Lossos, IS ;
Rosenwald, A ;
Boldrick, JG ;
Sabet, H ;
Tran, T ;
Yu, X ;
Powell, JI ;
Yang, LM ;
Marti, GE ;
Moore, T ;
Hudson, J ;
Lu, LS ;
Lewis, DB ;
Tibshirani, R ;
Sherlock, G ;
Chan, WC ;
Greiner, TC ;
Weisenburger, DD ;
Armitage, JO ;
Warnke, R ;
Levy, R ;
Wilson, W ;
Grever, MR ;
Byrd, JC ;
Botstein, D ;
Brown, PO ;
Staudt, LM .
NATURE, 2000, 403 (6769) :503-511
[2]   An optimal algorithm for closest-pair maintenance [J].
Bespamyatnikh, SN .
DISCRETE & COMPUTATIONAL GEOMETRY, 1998, 19 (02) :175-195
[3]  
Callahan P. B., 1992, Proceedings of the Twenty-Fourth Annual ACM Symposium on the Theory of Computing, P546, DOI 10.1145/129712.129766
[4]   Cluster analysis and display of genome-wide expression patterns [J].
Eisen, MB ;
Spellman, PT ;
Brown, PO ;
Botstein, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (25) :14863-14868
[5]  
Eppstein D, 1998, P 9 ACM SIAM S DISCR, P619
[6]   Gene expression profiles during the initial phase of salt stress in rice [J].
Kawasaki, S ;
Borchert, C ;
Deyholos, M ;
Wang, H ;
Brazille, S ;
Kawai, K ;
Galbraith, D ;
Bohnert, HJ .
PLANT CELL, 2001, 13 (04) :889-905
[7]   DNA microarray analysis of gene expression in response to physiological and genetic changes that affect tryptophan metabolism in Escherichia coli [J].
Khodursky, AB ;
Peter, BJ ;
Cozzarelli, NR ;
Botstein, D ;
Brown, PO ;
Yanofsky, C .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (22) :12170-12175