A novel HMM-based clustering algorithm for the analysis of gene expression time-course data

Cited by: 24
Authors
Zeng, YJ [1]
Garcia-Frias, J [1]
Affiliations
[1] Univ Delaware, Dept Elect & Comp Engn, Newark, DE 19716 USA
Keywords
clustering; hidden Markov models; gene expression time-course data; time-series analysis; temporal dependences
DOI
10.1016/j.csda.2005.07.007
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
A novel hidden Markov model (HMM) and clustering algorithm for the analysis of gene expression time-course data is proposed. The proposed model, called the profile-HMM, is specifically designed to explicitly take into account the dynamic nature of temporal gene expression profiles, which is ignored by many clustering methods existing in the literature. In this model, gene expression dynamics are represented by a special set of paths, with each path characterizing a stochastic pattern. The profile-HMM is trained to contain the most likely set of stochastic patterns given the dynamic microarray data, and the clustering result is obtained by grouping together the time-series that are most likely to be related to the same pattern. The novelty of the method is that the behavior of the whole gene expression dataset is modeled by a single HMM acting as a self-organizing map, so that all the clusters are implicitly and jointly defined in the model during the training phase. An attractive property of the profile-HMM clustering algorithm is its ability to automatically identify the number of clusters. The resulting performance is demonstrated by its application on simulated and biological data. © 2005 Elsevier B.V. All rights reserved.
Pages: 2472-2494
Number of pages: 23