A modified Fuzzy k-Partition based on indiscernibility relation for categorical data clustering

Cited by: 17
Authors
Yanto, Iwan Tri Riyadi [1]
Ismail, Maizatul Akmar [2]
Herawan, Tutut [2]
Affiliations
[1] Univ Ahmad Dahlan, Dept Informat Syst, Yogyakarta, Indonesia
[2] Univ Malaya, Dept Informat Syst, Kuala Lumpur 50603, Malaysia
Keywords
Clustering; Categorical data; Fuzzy k-Partition; Indiscernibility relation; ALGORITHM; MODEL
DOI
10.1016/j.engappai.2016.01.026
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Categorical data clustering has been adopted by many scientific communities to classify objects from large databases. To classify such objects, the Fuzzy k-Partition approach has been proposed for categorical data clustering. However, existing Fuzzy k-Partition approaches suffer from high computational time and low clustering accuracy. Moreover, the parameter that maximizes the classification likelihood function in the Fuzzy k-Partition approach always takes the same categories, hence producing the same results. To overcome these issues, we propose a modified Fuzzy k-Partition based on the indiscernibility relation. The indiscernibility relation induces an approximation space constructed from equivalence classes of indiscernible objects, so it can be applied to classify categorical data. The novelty of the proposed approach is that, unlike previous approaches that use the likelihood function of multivariate multinomial distributions, it is based on the indiscernibility relation. We performed an extensive theoretical analysis of the proposed approach to show its effectiveness in achieving lower computational complexity. Further, we compared the proposed approach with the Fuzzy Centroid and Fuzzy k-Partition approaches in terms of response time and clustering accuracy on several UCI benchmark and real-world datasets. The results show that the proposed approach achieves lower response time and higher clustering accuracy than the other fuzzy k-based approaches. (C) 2016 Elsevier Ltd. All rights reserved.
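The notion of an indiscernibility relation used in the abstract can be illustrated with a minimal Python sketch. It is not the authors' clustering algorithm; it only shows how objects that agree on a chosen set of categorical attributes fall into the same equivalence class. The function name `indiscernibility_classes`, the toy table, and its attribute names are assumptions made for illustration.

```python
from collections import defaultdict


def indiscernibility_classes(rows, attributes):
    """Partition object indices into equivalence classes.

    Two objects are indiscernible with respect to `attributes` when they
    have identical values on every attribute in that list.
    """
    classes = defaultdict(list)
    for idx, row in enumerate(rows):
        # Objects with the same value tuple share one equivalence class.
        key = tuple(row[a] for a in attributes)
        classes[key].append(idx)
    return list(classes.values())


# Hypothetical toy table of categorical objects (not from the paper).
toy = [
    {"color": "red", "shape": "round"},
    {"color": "red", "shape": "square"},
    {"color": "blue", "shape": "round"},
    {"color": "red", "shape": "round"},
]

by_color = indiscernibility_classes(toy, ["color"])
by_both = indiscernibility_classes(toy, ["color", "shape"])
print(by_color)
print(by_both)
```

Adding attributes can only split classes further, which is why the partition on both attributes is at least as fine as the partition on `color` alone.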
Pages: 41-52
Page count: 12