Variable Weighting in Fuzzy k-Means Clustering to Determine the Number of Clusters

Cited by: 38
Authors
Khan, Imran [1]
Luo, Zongwei [1]
Huang, Joshua Zhexue [2]
Shahzad, Waseem [3]
Affiliations
[1] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen Key Lab Computat Intelligence, Shenzhen 518055, Guangdong, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Guangdong, Peoples R China
[3] Natl Univ Comp & Emerging Sci, Dept Comp Sci, Islamabad 44000, Pakistan
Keywords
Fuzzy k-means; clustering; number of clusters; data mining; variable weighting; MEANS ALGORITHM; DATA SETS; SELECTION; CENTERS; MODEL
DOI
10.1109/TKDE.2019.2911582
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
One of the most significant problems in cluster analysis is determining the number of clusters in unlabeled data, which most clustering algorithms require as input. Several methods have been developed to address this problem. However, little attention has been paid to algorithms that are insensitive to the initialization of cluster centers and use variable weights to recover the number of clusters. To fill this gap, we extend the standard fuzzy k-means clustering algorithm. The extended algorithm automatically determines the number of clusters by iteratively calculating the weights of all variables and the membership value of each object in all clusters. Two new steps are added to the fuzzy k-means clustering process. The first introduces a penalty term that makes the clustering process insensitive to the initial cluster centers. The second uses a formula to iteratively update the variable weights in each cluster based on the current partition of the data. Experimental results on real-world and synthetic datasets show that the proposed algorithm effectively determines the correct number of clusters when initialized with different numbers of cluster centroids. We also tested the proposed algorithm on gene data to determine a subset of important genes.
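For orientation, the following is a minimal sketch of the standard fuzzy k-means baseline that the abstract says is being extended. It is not the proposed algorithm: the penalty term and the per-cluster variable-weight update are not specified in this record, so neither is implemented here. The function name `fuzzy_k_means` and the parameters `m` (fuzzifier), `init_centers`, `max_iter`, `tol`, and `seed` are illustrative assumptions, not names from the paper.

```python
import math
import random


def fuzzy_k_means(points, k, m=2.0, init_centers=None, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy k-means (baseline only; no variable weights, no penalty term).

    Returns (centers, memberships), where memberships[i][j] is the degree to
    which point i belongs to cluster j, computed in the last iteration run.
    """
    if init_centers is None:
        # Assumption: pick k distinct data points as the starting centers.
        init_centers = random.Random(seed).sample(list(points), k)
    centers = [list(c) for c in init_centers]
    dim = len(centers[0])
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        # Membership update: closer centers receive larger membership values,
        # and each point's memberships sum to 1.
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if min(dists) == 0.0:
                # The point sits exactly on one or more centers: share membership among them.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                memberships.append([h / sum(hits) for h in hits])
            else:
                inv = [d ** -exponent for d in dists]
                memberships.append([v / sum(inv) for v in inv])
        # Center update: mean of all points, weighted by membership ** m.
        new_centers = []
        for j in range(k):
            w = [row[j] ** m for row in memberships]
            total = sum(w)
            new_centers.append(
                [sum(wi * p[d] for wi, p in zip(w, points)) / total for d in range(dim)]
            )
        # Stop once no center moves more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships
```

Note that this baseline takes `k` as a fixed input and depends on its starting centers, which are the two limitations the abstract says the extension is designed to remove.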
Pages: 1838-1853
Number of pages: 16