A Fuzzy Threshold Based Modified Clustering Algorithm for Natural Data Exploration

Citations: 0
Authors
Thomas, Binu [1]
Raju, G. [2]
Affiliations
[1] Marian Coll, Dept Comp Applicat, Kuttiikkanam, Kerala, India
[2] Kannur Univ, Dept Informat Technol, Kannur, Kerala, India
Source
INTELLIGENCE AND SECURITY INFORMATICS, PROCEEDINGS | 2010 / Vol. 6122
Keywords
Clustering; data mining; fuzzy c-means; fuzzy clustering; unsupervised clustering;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Traditional supervised clustering methods require the user to specify the number of clusters before any data exploration begins. The data engineer also has to select the initial cluster seeds. In the c-means clustering method, the performance of the algorithm depends mainly on the initial choice of the number of clusters and of the cluster seeds. With real-world data, choosing the initial cluster count and centroids becomes a tedious task. In this paper we propose a modified clustering algorithm that works on the principles of fuzzy clustering. The proposed method uses a modified form of the popular fuzzy c-means algorithm for membership calculation. The algorithm begins with the assumption that all data points are initial centroids. Clusters are then merged repeatedly, based on a threshold value, until the optimum number of clusters is reached. The algorithm is also capable of detecting outliers. It was tested on data from the Gross National Happiness (GNH) program of Bhutan and found to be highly efficient in segmenting natural data sets.
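The abstract gives only the outline of the procedure, so the following is a minimal sketch under stated assumptions, not the authors' implementation. It assumes Euclidean distance, merges the two closest centroids while their distance is at most a user-supplied `threshold`, uses the standard (unmodified) fuzzy c-means membership formula with fuzzifier `m`, and flags as outliers the points whose nearest centroid absorbed fewer than `min_size` points. The names `fcm_membership`, `threshold_merge_clustering`, `threshold`, and `min_size` are hypothetical and chosen for illustration.

```python
import numpy as np


def fcm_membership(points, centroids, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership u[i, j] of point i in cluster j."""
    # Pairwise Euclidean distances, shape (n_points, n_centroids).
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    d = np.maximum(d, eps)  # guard against a point lying exactly on a centroid
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1


def threshold_merge_clustering(points, threshold, m=2.0, min_size=2):
    """Assumed merge rule: start with every point as a centroid and merge
    the closest pair of centroids while their distance <= threshold."""
    points = np.asarray(points, dtype=float)
    centroids = points.copy()
    weights = np.ones(len(points))  # number of points absorbed per centroid

    while len(centroids) > 1:
        d = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=2)
        np.fill_diagonal(d, np.inf)  # ignore self-distances
        i, j = np.unravel_index(np.argmin(d), d.shape)
        if d[i, j] > threshold:
            break  # no remaining pair is close enough to merge
        total = weights[i] + weights[j]
        # Weighted mean keeps the merged centroid at the mean of its points.
        centroids[i] = (weights[i] * centroids[i] + weights[j] * centroids[j]) / total
        weights[i] = total
        centroids = np.delete(centroids, j, axis=0)
        weights = np.delete(weights, j)

    u = fcm_membership(points, centroids, m)
    labels = u.argmax(axis=1)  # hard assignment from the fuzzy memberships
    # Assumed outlier rule: the point's cluster never grew to min_size points.
    outliers = weights[labels] < min_size
    return centroids, u, labels, outliers
```

Calling `threshold_merge_clustering(data, threshold=1.0)` on an `(n, d)` array returns the surviving centroids, the membership matrix, hard labels, and a boolean outlier mask. The number of clusters is not supplied by the caller; it falls out of the threshold, which is the property the abstract emphasizes.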
Pages: 167 / +
Number of pages: 3