Weighted Fuzzy C-Means: Unsupervised Feature Selection to Realize a Target Partition

Cited by: 0
Authors
Sarkar, Kaushik [1]
Mudi, Rajani K. [2]
Pal, Nikhil R. [3]
Affiliations
[1] Narula Inst Technol, Dept Elect & Commun Engn, Kolkata, India
[2] Jadavpur Univ, Dept Instrumentat & Elect Engn, Kolkata, India
[3] Indian Stat Inst, Elect & Commun Sci Unit, Kolkata, India
Keywords
Fuzzy C-Means; weighted FCM; regularizer; feature selection; clustering; target partition; ALGORITHM
DOI
10.1142/S0218488524500260
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce an unsupervised feature selection method based on regularized weighted Fuzzy C-Means (WRFCM) clustering. When the target task is clustering, the objective should be to select a subset of features that yields a partition matrix identical or similar to the one a clustering algorithm obtains from the original high-dimensional data. To achieve this, we propose a novel objective function designed around the Fuzzy C-Means (FCM) clustering algorithm. The approach performs feature selection within the WRFCM framework, emphasizing the features that preserve the FCM-based target partition. We evaluate the method using Normalized Mutual Information (NMI), the Adjusted Rand Index (ARI), and the Kuhn-Munkres index (KM-index). NMI and ARI measure the agreement between the partition in the lower dimension and the partition of the original data, whereas the KM-index measures the disagreement between the two partitions. Experimental results on synthetic and real datasets demonstrate the method's efficacy in selecting informative features. This approach fills a crucial gap in unsupervised feature selection, making it valuable for real-world applications. The approach is very general in the sense that the target partition can be generated by any clustering algorithm, or even taken from the actual class labels of the data when they are available.
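The evaluation idea in the abstract (compare the partition obtained from a feature subset with the target partition obtained from all features) can be illustrated with a minimal sketch. This is not the authors' WRFCM objective, which the record does not reproduce: it assumes plain Fuzzy C-Means, a hand-picked feature subset instead of learned feature weights, hardened (argmax) labels, and ARI as the only agreement measure. The names `fcm`, `adjusted_rand_index`, and `partition_agreement`, and the synthetic data, are hypothetical.

```python
from collections import Counter
from math import comb

import numpy as np


def fcm(X, c, m=2.0, n_iter=100, seed=0):
    """Plain Fuzzy C-Means (no feature weights); returns the (n, c) membership matrix."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=X.shape[0])  # random start, rows sum to 1
    for _ in range(n_iter):
        Um = U ** m
        V = (Um.T @ X) / Um.sum(axis=0)[:, None]  # cluster centers, shape (c, d)
        # Squared Euclidean distances, shape (n, c); epsilon avoids division by zero.
        d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + 1e-12
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)  # standard FCM membership update
    return U


def adjusted_rand_index(a, b):
    """ARI between two hard labelings of the same n >= 2 points."""
    n = len(a)
    sum_ij = sum(comb(v, 2) for v in Counter(zip(a, b)).values())
    sum_a = sum(comb(v, 2) for v in Counter(a).values())
    sum_b = sum(comb(v, 2) for v in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = 0.5 * (sum_a + sum_b)
    if max_index == expected:  # degenerate case, e.g. both labelings have one cluster
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def partition_agreement(X, feature_idx, c):
    """ARI between the full-data FCM partition and the partition on a feature subset."""
    target = fcm(X, c).argmax(axis=1)
    reduced = fcm(X[:, feature_idx], c).argmax(axis=1)
    return adjusted_rand_index(target.tolist(), reduced.tolist())


if __name__ == "__main__":
    # Assumed toy data: features 0-1 separate two groups, features 2-3 are noise.
    rng = np.random.default_rng(0)
    informative = np.vstack([rng.normal(0.0, 0.5, (50, 2)),
                             rng.normal(5.0, 0.5, (50, 2))])
    noise = rng.normal(0.0, 0.5, (100, 2))
    X = np.hstack([informative, noise])
    print("ARI, informative features:", partition_agreement(X, [0, 1], c=2))
    print("ARI, noise features:      ", partition_agreement(X, [2, 3], c=2))
```

Because ARI is invariant to cluster relabeling, the two FCM runs do not need matching cluster indices. A feature subset that preserves the target partition should score close to 1, while an uninformative subset should score near 0.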
Pages: 1111-1134
Number of pages: 24