Automated Attribute Weighting Fuzzy k-Centers Algorithm for Categorical Data Clustering

Cited by: 4

Authors:

Mau, Toan Nguyen [1]

Huynh, Van-Nam [1]

Affiliations:

[1] Japan Adv Inst Sci & Technol, Sch Adv Sci & Technol, Nomi, Ishikawa, Japan

Source:

MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE (MDAI 2021) | 2021 / Vol. 12898

Keywords:

Fuzzy clustering; Categorical data; k-representatives; k-centers; Modes algorithm

DOI:

10.1007/978-3-030-85529-1_17

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Cluster analysis plays an important role in exploring the correlations in data by dividing datasets into separate clusters so that similar objects are located in the same cluster. Moreover, fuzzy cluster analysis can reveal the mixtures of clusters in datasets containing multiple distributions. The outcome of a clustering method is largely determined by its similarity definition, so the similarity measurement is exceedingly important to the formation of fuzzy clusters. The similarity between two objects is mostly calculated by the mean of differences across multiple dimensions. However, the dissimilarity in some dimensions has little or no effect on the fuzzy clustering outcome. In this study, we explore such impacts for fuzzy clustering of data with categorical attributes. Accordingly, the impact of each attribute on each fuzzy cluster is calculated using an optimizer, and the overlapping dissimilar values are then adjusted by the corresponding weights. We propose to apply this approach to the Fk-centers clustering algorithm, and the experimental results show that our proposed method can achieve higher fuzzy silhouette scores than other related works. These results demonstrate the applicability of deploying the proposed method in real-world applications.

Pages: 205-217

Page count: 13
