Fuzzy C-means method for clustering microarray data

Times cited: 343
Authors
Dembélé, D [1]
Kastner, P [1]
Affiliations
[1] ULP, CNRS, INSERM, Inst Genet & Biol Mol & Cellulaire, F-67404 Illkirch Graffenstaden, France
DOI
10.1093/bioinformatics/btg119
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Clustering analysis of data from DNA microarray hybridization studies is essential for identifying biologically relevant groups of genes. Partitional clustering methods such as K-means or self-organizing maps assign each gene to a single cluster. However, these methods do not provide information about the influence of a given gene on the overall shape of clusters. Here we apply a fuzzy partitioning method, Fuzzy C-means (FCM), to attribute cluster membership values to genes.
Results: A major problem in applying the FCM method for clustering microarray data is the choice of the fuzziness parameter m. We show that the commonly used value m = 2 is not appropriate for some data sets, and that optimal values for m vary widely from one data set to another. We propose an empirical method, based on the distribution of distances between genes in a given data set, to determine an adequate value for m. By setting threshold levels for the membership values, genes that are tightly associated with a given cluster can be selected. Using a yeast cell cycle data set as an example, we show that this selection increases the overall biological significance of the genes within the cluster.
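As an illustration of the membership values and threshold selection described in the abstract, the following is a minimal sketch of the standard Fuzzy C-means updates in Python with NumPy. It is not the authors' implementation: the function name `fuzzy_c_means`, the synthetic data, the value m = 1.5, and the threshold 0.7 are all assumptions chosen for illustration, and the paper's empirical procedure for choosing m is not reproduced here.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Textbook FCM sketch (hypothetical helper, not the paper's code).

    X: (n_genes, n_conditions) array; c: number of clusters; m > 1: fuzziness.
    Returns (centers, U) where U[i, k] is the membership of row i in cluster k.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are means of the data weighted by memberships raised to m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every row to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2 / (m - 1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-in for expression profiles: two well-separated groups.
data_rng = np.random.default_rng(1)
X = np.vstack([
    data_rng.normal(0.0, 0.3, size=(20, 4)),
    data_rng.normal(3.0, 0.3, size=(20, 4)),
])

centers, U = fuzzy_c_means(X, c=2, m=1.5)  # m = 1.5 is an arbitrary choice

# Keep only rows whose membership in a cluster reaches the threshold.
threshold = 0.7  # illustrative value, not taken from the paper
selected = [np.flatnonzero(U[:, k] >= threshold) for k in range(2)]
labels = U.argmax(axis=1)
```

Raising `threshold` shrinks each selected set to the rows most tightly associated with a cluster, which is the kind of selection the abstract describes. On real microarray data the result depends strongly on m, so the values used here should not be read as recommendations.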
Pages: 973-980
Page count: 8