Fuzzy Divergence Weighted Ensemble Clustering With Spectral Learning Based on Random Projections for Big Data

Cited by: 1
Authors
Lahmar, Ines [1]
Zaier, Aida [2]
Yahia, Mohamed [3]
Ali, Tarig [4]
Bouallegue, Ridha [2]
Institutions
[1] Univ Gabes, MACS Lab, Gabes 6029, Tunisia
[2] Univ Carthage Tunis, InnovCom Lab, Tunis 1002, Tunisia
[3] Univ Tunis El Manar, ENIT, SYSCOM Lab, Tunis 1002, Tunisia
[4] Amer Univ Sharjah, GIS & Mapping Lab, Sharjah, U Arab Emirates
Keywords
Matrix converters; Entropy; Clustering algorithms; Uncertainty; Reliability; Weight measurement; Sparse matrices; Ensemble learning; Fuzzy systems; Spectral analysis; Fuzzy ensemble clustering; high-dimensional data; random projection; Kullback-Leibler divergence entropy; spectral learning; FACE-RECOGNITION;
DOI
10.1109/ACCESS.2024.3359299
CLC number
TP [Automation technology; computer technology]
Subject classification number
0812
Abstract
In many real-world applications, data are described by high-dimensional feature spaces, which poses new challenges for current ensemble clustering methods. Ensemble clustering aims to combine sets of base clusterings to enhance clustering accuracy, but high dimensionality makes the base clusterings susceptible to low quality, and the reliability of existing ensemble clustering methods on high-dimensional data still needs improvement. In this context, we propose a new fuzzy divergence-weighted ensemble clustering method based on random projection and spectral learning. First, random projection (RP) is used to create data of various dimensions and to obtain membership matrices via fuzzy c-means (FCM). Second, the fuzzy partitions of the random projections are ranked using entropy-based local weighting together with the Kullback-Leibler (KL) divergence to detect uncertainty, and this ranking is then used to evaluate the weight of each cluster. Finally, regularized graphs are built from these membership matrices, and their affinity matrices are estimated with spectral matrices using fuzzy KL-divergence anchor graphs; obtaining the final consensus is then cast as an optimization problem whose solution yields the ensemble clustering results. Experimental results on high-dimensional data demonstrate the efficiency of our method compared with state-of-the-art methods.
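The first two stages described in the abstract (multiple random projections, FCM membership matrices per projection, and an entropy/KL-based weight per cluster) can be sketched in Python as follows. This is a minimal illustrative sketch, not the authors' implementation: the fuzzy_c_means and cluster_weights_kl helpers and the choice of KL divergence from the uniform distribution as the uncertainty measure are assumptions made for illustration only.

# Minimal sketch (assumed, not the paper's code): generate several random
# projections, run fuzzy c-means on each to get membership matrices, and
# weight each cluster by a KL-divergence-based confidence measure.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Plain NumPy fuzzy c-means; returns the (n_samples, n_clusters) membership matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        new_U = 1.0 / (dist ** (2.0 / (m - 1)))
        new_U /= new_U.sum(axis=1, keepdims=True)
        if np.abs(new_U - U).max() < tol:
            return new_U
        U = new_U
    return U


def cluster_weights_kl(U):
    """Weight each cluster by the KL divergence of its column-normalized
    memberships from the uniform distribution: confident (low-entropy)
    clusters get larger weights.  Assumed stand-in for the paper's scheme."""
    n = U.shape[0]
    P = U / U.sum(axis=0, keepdims=True)
    kl = np.sum(P * np.log((P + 1e-12) * n), axis=0)
    return kl / kl.sum()


# Toy usage: three random projections of a synthetic high-dimensional dataset.
X = np.random.default_rng(0).random((200, 500))   # 200 samples, 500 features
memberships, weights = [], []
for seed in range(3):
    rp = GaussianRandomProjection(n_components=20, random_state=seed)
    U = fuzzy_c_means(rp.fit_transform(X), n_clusters=4, seed=seed)
    memberships.append(U)
    weights.append(cluster_weights_kl(U))
print([w.round(3) for w in weights])

The later spectral-learning stage (anchor graphs and the final consensus optimization) is not sketched here, since the abstract does not specify its formulation.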
Pages: 20197-20208
Page count: 12