An Incremental Clustering with Attribute Unbalance Considered for Categorical Data

Cited by: 0
Authors
Chen, Jize [2]
Yang, Zhimin [1]
Yin, Jian [2]
Yang, Xiaobo [1]
Huang, Li [1]
Affiliations
[1] Guangzhou Univ Chinese Med, Affiliated Hosp 2, Guangzhou 510120, Peoples R China
[2] Sun Yat Sen Univ, Sch Informat Sci & Technol, Guangzhou 510275, Guangdong, Peoples R China
Source
COMPUTATIONAL INTELLIGENCE AND INTELLIGENT SYSTEMS | 2009 / Vol. 51
Funding
National Natural Science Foundation of China;
Keywords
incremental clustering; attribute unbalance; categorical data;
DOI
10.1007/978-3-642-04962-0_50
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering analysis is an important technique used in many fields. Traditional clustering algorithms, however, generally deal with numeric data, while the clustering of categorical data has long attracted researchers' attention because such data are prevalent in real life. This paper analyses the limitations of previously proposed categorical clustering algorithms. Based on two observations, a new similarity measure for categorical data is proposed that takes the unbalance of attributes into account. As data sets grow larger and more dynamic, incrementality becomes an important quality of a good clustering algorithm. The clustering algorithm presented is incremental and has linear computational complexity. The experimental results indicate that it outperforms the other categorical clustering algorithms referred to in the paper.
Pages: 433 / +
Page count: 2