Sparsity Fuzzy C-Means Clustering With Principal Component Analysis Embedding

Cited by: 13

Authors:

Chen, Jingwei [1,2]

Zhu, Jianyong [1,2]

Jiang, Hongyun [5]

Yang, Hui [1,2]

Nie, Feiping [3,4]

Affiliations:

[1] East China Jiaotong Univ, Sch Elect & Automat Engn, Nanchang 330013, Peoples R China

[2] East China Jiaotong Univ, Key Lab Adv Control & Optimizat Jiangxi Prov, Nanchang 330013, Peoples R China

[3] Northwestern Polytech Univ, Sch Comp Sci, Sch Artificial Intelligence Opt & ElectroN iOPEN, Xian 710072, Peoples R China

[4] Northwestern Polytech Univ, Minist Ind & Informat Technol, Key Lab Intelligent Interact & Applicat, Xian 710072, Peoples R China

[5] China Railway Conservancy & Hydropower Planning &, Nanchang 330029, Peoples R China

Source:

IEEE TRANSACTIONS ON FUZZY SYSTEMS | 2023 / Vol. 31 / Issue 07

Funding:

National Natural Science Foundation of China;

Keywords:

Principal component analysis; Clustering methods; Robustness; Feature extraction; Clustering algorithms; Data mining; Dimensionality reduction; Clustering; fuzzy c-means (FCM); outliers; principal component analysis (PCA); sparsity; K-MEANS; ALGORITHM; FCM;

DOI:

10.1109/TFUZZ.2022.3217343

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Clustering is widely used in data mining, pattern recognition, and image identification. Fuzzy c-means (FCM) is a soft clustering method that introduces the concept of membership. In FCM, the fuzzy membership matrix is obtained by calculating the distance between data points in the original space. However, this approach may yield suboptimal results owing to the influence of redundant features. Moreover, FCM is sensitive to noise points and outliers. In this article, we propose a method called sparsity FCM clustering with principal component analysis embedding (P_SFCM). We conduct principal component analysis and membership learning simultaneously, and add a weighting factor for each data point in order to identify noise and outliers. The benefit of this framework is that it retains most of the information in the subspace while improving robustness to noise. We employ an iterative optimization algorithm to solve the model efficiently. To verify the reliability of the proposed method, we conduct a convergence analysis, a noise robustness analysis, and multicluster experiments. Furthermore, comparative experiments are conducted on both synthetic and real benchmark datasets. The experimental results show that P_SFCM is competitive with comparable methods.
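For readers unfamiliar with the baseline the abstract builds on, the following is a minimal sketch of standard fuzzy c-means, in which memberships are computed from distances in the original feature space. It is not the paper's P_SFCM: it has no PCA embedding, no sparsity constraint, and no per-point weighting factor. The function name `fcm`, its parameters, and the toy data are assumptions made for illustration only.

```python
import random


def fcm(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative baseline, not P_SFCM).

    points: list of equal-length lists of floats; c: number of clusters;
    m: fuzzifier (> 1). Returns (centers, memberships), where
    memberships[i][k] is the degree to which point i belongs to cluster k.
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([r / s for r in row])

    centers = []
    for _ in range(iters):
        # Centre update: membership-weighted mean of the points.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][j] for i in range(n)) / total for j in range(d)]
            )

        # Membership update from squared Euclidean distances in the original space.
        new_u = []
        for x in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(x, v)) for v in centers]
            if min(d2) == 0.0:
                # Point coincides with a centre: share membership among those centres.
                zeros = [1.0 if v == 0.0 else 0.0 for v in d2]
                s = sum(zeros)
                new_u.append([z / s for z in zeros])
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                s = sum(inv)
                new_u.append([v / s for v in inv])

        # Stop once no membership changes by more than tol.
        change = max(abs(a - b) for ra, rb in zip(u, new_u) for a, b in zip(ra, rb))
        u = new_u
        if change < tol:
            break
    return centers, u


# Toy data (hypothetical): two well-separated groups of three points each.
data = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.0], [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
centers, memberships = fcm(data, c=2)
labels = [row.index(max(row)) for row in memberships]
```

Because every point contributes to every centre in this baseline, a single distant outlier pulls the centres toward itself, which is the sensitivity the abstract says the per-point weighting factor is meant to address.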

Pages: 2099-2111

Page count: 13

