Fuzzy C-means method for clustering microarray data

Cited by: 344
Authors
Dembélé, D [1]
Kastner, P [1]
Affiliation
[1] ULP, CNRS, INSERM, Inst Genet & Biol Mol & Cellulaire, F-67404 Illkirch Graffenstaden, France
DOI
10.1093/bioinformatics/btg119
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Motivation: Clustering analysis of data from DNA microarray hybridization studies is essential for identifying biologically relevant groups of genes. Partitional clustering methods such as K-means or self-organizing maps assign each gene to a single cluster. However, these methods do not provide information about the influence of a given gene on the overall shape of clusters. Here we apply a fuzzy partitioning method, Fuzzy C-means (FCM), to attribute cluster membership values to genes.
Results: A major problem in applying the FCM method for clustering microarray data is the choice of the fuzziness parameter m. We show that the commonly used value m = 2 is not appropriate for some data sets, and that optimal values for m vary widely from one data set to another. We propose an empirical method, based on the distribution of distances between genes in a given data set, to determine an adequate value for m. By setting threshold levels for the membership values, genes that are tightly associated with a given cluster can be selected. Using a yeast cell cycle data set as an example, we show that this selection increases the overall biological significance of the genes within the cluster.
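To make the two ideas in the abstract concrete (membership values controlled by a fuzziness parameter m, and selection of tightly associated items by a membership threshold), the following is a minimal sketch of the textbook FCM update rules in plain Python. It is not the authors' implementation, and it does not reproduce their empirical method for choosing m. The function names `fuzzy_c_means` and `select_core`, the toy data, and the threshold 0.8 are all illustrative assumptions.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook FCM on a list of equal-length tuples (hypothetical helper).

    Returns (centers, memberships); memberships[i][k] is the degree to which
    point i belongs to cluster k, and each row sums to 1.
    """
    if m <= 1.0:
        raise ValueError("fuzziness parameter m must be > 1")
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Center k is the mean of all points weighted by u[i][k] ** m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ))
        # Membership is proportional to distance ** (-2 / (m - 1)),
        # computed in log space to avoid overflow when m is close to 1.
        new_u = []
        for p in points:
            dist = [math.dist(p, ck) for ck in centers]
            if min(dist) == 0.0:
                # Point sits exactly on a center: share membership among those.
                w = [1.0 if d == 0.0 else 0.0 for d in dist]
            else:
                logw = [-2.0 / (m - 1.0) * math.log(d) for d in dist]
                top = max(logw)
                w = [math.exp(v - top) for v in logw]
            total = sum(w)
            new_u.append([v / total for v in w])
        change = max(abs(new_u[i][k] - u[i][k])
                     for i in range(n) for k in range(c))
        u = new_u
        if change < tol:
            break
    return centers, u


def select_core(memberships, threshold):
    """Indices of points whose largest membership reaches the threshold."""
    return [i for i, row in enumerate(memberships) if max(row) >= threshold]


# Toy data (an assumption, not microarray data): two tight groups plus one
# point lying between them.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5.5, 5.5)]
centers, memberships = fuzzy_c_means(points, c=2, m=2.0)
core = select_core(memberships, threshold=0.8)
```

In this toy setting the in-between point receives roughly balanced memberships in both clusters, so a threshold on the largest membership leaves it out of `core`, while a hard partition such as K-means would have to assign it to one cluster.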
Pages: 973-980
Number of pages: 8
Related papers
50 in total
  • [21] A New Fuzzy c-Means Clustering Algorithm for Interval Data
    Jin, Yan
    Ma, Jianghong
    2013 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE (ICCSAI 2013), 2013, : 156 - 159
  • [22] Fuzzy c-means clustering methods for symbolic interval data
    de Carvalho, Francisco de A. T.
    PATTERN RECOGNITION LETTERS, 2007, 28 (04) : 423 - 437
  • [23] Interval kernel Fuzzy C-Means clustering of incomplete data
    Li, Tianhao
    Zhang, Liyong
    Lu, Wei
    Hou, Hui
    Liu, Xiaodong
    Pedrycz, Witold
    Zhong, Chongquan
    NEUROCOMPUTING, 2017, 237 : 316 - 331
  • [24] Generalized fuzzy c-means clustering in the presence of outlying data
    Hathaway, RJ
    Overstreet, DD
    Hu, YK
    Davenport, JW
    APPLICATIONS AND SCIENCE OF COMPUTATIONAL INTELLIGENCE II, 1999, 3722 : 509 - 517
  • [25] Application of Fuzzy c-Means Clustering in Data Analysis of Metabolomics
    Li, Xiang
    Lu, Xin
    Tian, Jing
    Gao, Peng
    Kong, Hongwei
    Xu, Guowang
    ANALYTICAL CHEMISTRY, 2009, 81 (11) : 4468 - 4475
  • [26] Fuzzy C-means clustering algorithm based on incomplete data
    Jia, Zhiping
    Yu, Zhiqiang
    Zhang, Chenghui
    2006 IEEE INTERNATIONAL CONFERENCE ON INFORMATION ACQUISITION, VOLS 1 AND 2, CONFERENCE PROCEEDINGS, 2006, : 600 - 604
  • [27] Cluster Forests Based Fuzzy C-Means for Data Clustering
    Ben Ayed, Abdelkarim
    Ben Halima, Mohamed
    Alimi, Adel M.
    INTERNATIONAL JOINT CONFERENCE SOCO'16- CISIS'16-ICEUTE'16, 2017, 527 : 564 - 573
  • [28] Extended fuzzy c-means: an analyzing data clustering problems
    S. Ramathilagam
    R. Devi
    S. R. Kannan
    Cluster Computing, 2013, 16 : 389 - 406
  • [29] On Tolerant Fuzzy c-Means Clustering
    Hamasuna, Yukihiro
    Endo, Yasunori
    Miyamoto, Sadaaki
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2009, 13 (04) : 421 - 428
  • [30] Parallel fuzzy c-means clustering for large data sets
    Kwok, T
    Smith, K
    Lozan, S
    Taniar, D
    EURO-PAR 2002 PARALLEL PROCESSING, PROCEEDINGS, 2002, 2400 : 365 - 374