Incremental Clustering for Categorical Data Using Clustering Ensemble

Cited by: 0
Authors
Li Taoying [1]
Chen Yan [1]
Qu Lili [1]
Mu Xiangwei [1]
Affiliation
[1] Dalian Maritime Univ, Transportat Management Coll, Dalian 116026, Peoples R China
Source
PROCEEDINGS OF THE 29TH CHINESE CONTROL CONFERENCE | 2010
Keywords
Data Mining; Clustering; Incremental Clustering; Clustering Ensemble; K-Means Algorithm
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
More and more data in practice changes every minute and is collected incrementally, so incremental clustering has attracted considerable attention from researchers. However, little research currently focuses on partitioning categorical data in incremental mode, and designing an incremental clustering method for categorical data is an urgent problem. In this paper, we propose an incremental clustering method for categorical data based on clustering ensemble. We first prune redundant attributes if needed, then use the true values of the different attributes to form clustering memberships, and finally use clustering ensemble to merge or divide clusters to obtain the optimal clustering. The proposed algorithm is applied to the Yellow-Small, Diagnosis, and Zoo datasets, and the results show that it is effective.
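The abstract does not give the algorithm's details, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each categorical attribute induces one partition (records sharing a value fall in the same group), that the ensemble is combined through a co-association score (the fraction of attributes on which two records agree), and that a new record joins the best-matching cluster or starts a new one. The function names, the threshold value, and the toy records are all hypothetical, and the attribute-pruning step is omitted.

```python
from itertools import combinations


def agreement(a, b):
    """Fraction of attributes on which two records share the same value."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def consensus_clusters(records, threshold=0.6):
    """Merge records whose co-association meets the threshold (union-find)."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if agreement(records[i], records[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def add_record(records, clusters, new_record, threshold=0.6):
    """Incremental step: join the best-matching cluster or start a new one."""
    records.append(new_record)
    new_index = len(records) - 1
    best, best_score = None, 0.0
    for cluster in clusters:
        # Mean agreement between the new record and the cluster's members.
        score = sum(agreement(records[i], new_record) for i in cluster) / len(cluster)
        if score > best_score:
            best, best_score = cluster, score
    if best is not None and best_score >= threshold:
        best.append(new_index)
    else:
        clusters.append([new_index])
    return clusters


# Hypothetical toy records: (color, size, action).
records = [
    ("yellow", "small", "stretch"),
    ("yellow", "small", "dip"),
    ("purple", "large", "dip"),
    ("purple", "large", "stretch"),
]
clusters = consensus_clusters(records)
clusters = add_record(records, clusters, ("yellow", "small", "stretch"))
print(clusters)
```

In this sketch the threshold controls how readily clusters merge: a lower value merges more aggressively, while a higher value leaves more records in clusters of their own.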
Pages: 2519-2524
Page count: 6