K-means - Laplacian clustering revisited

Cited by: 9
Authors
Rengasamy, Sundar [1]
Murugesan, Punniyamoorthy [1]
Affiliations
[1] Natl Inst Technol Tiruchirappalli, Tiruchirappalli 620015, Tamil Nadu, India
Keywords
K-means clustering; Spectral clustering; Kernel functions; Similarity matrix; Relation information; Laplacian matrix
DOI
10.1016/j.engappai.2021.104535
CLC classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering real-life data using attribute information alone is often not enough, because the data come from various sources and the data objects may be related to one another. It is therefore necessary to include relation information when clustering, and many researchers have used it alongside attribute information. The Integrated K-means Laplacian (IKL) algorithm is one such method: it integrates attribute information and pair-wise relations to cluster the data, and it is well known for this approach. However, it has issues in the creation of the normalized Laplacian matrix. The current study proposes three different ways of creating the normalized Laplacian matrix to rectify those issues, yielding three new variants of the IKL algorithm. The pair-wise similarity matrix (W) is another crucial element of the IKL algorithm. Earlier work used the Gaussian function to create W in IKL, whereas this study proposes 12 different kernel functions to form W instead and studies their influence on the performance of the existing and proposed algorithms. Nine benchmark datasets are used to demonstrate these effects. Further, the performance of the proposed algorithms is compared with that of existing algorithms from the recent literature using seven clustering evaluation metrics and the running time of the algorithms. The comparison studies reveal that the proposed modifications to the IKL algorithm are significant, and statistical tests confirm this. In addition, an analysis is carried out in which the XX' matrix is replaced with kernel functions, and the resulting improvements in performance are studied.
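For orientation, the two building blocks named in the abstract can be sketched as follows: a pair-wise similarity matrix W formed with a Gaussian function, and a normalized Laplacian derived from W. This is a minimal illustration under stated assumptions, not the paper's method. The abstract does not describe the three proposed Laplacian constructions or the 12 kernels, so the sketch uses the textbook symmetric normalization L = I - D^{-1/2} W D^{-1/2}; the function names and the bandwidth parameter `sigma` are hypothetical.

```python
import numpy as np


def gaussian_similarity(X, sigma=1.0):
    """Pair-wise Gaussian similarity W[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)).

    `sigma` is an assumed bandwidth; the abstract does not say how it is chosen.
    """
    sq_norms = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative values caused by floating-point round-off.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def normalized_laplacian(W):
    """Textbook symmetric normalized Laplacian L = I - D^{-1/2} W D^{-1/2}.

    This is the standard definition, not one of the paper's three variants.
    """
    degrees = W.sum(axis=1)  # strictly positive here because W[i, i] = 1
    inv_sqrt = 1.0 / np.sqrt(degrees)
    return np.eye(W.shape[0]) - (inv_sqrt[:, None] * W) * inv_sqrt[None, :]


# Toy data: 6 objects with 3 attributes each (rows of X are data objects).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
W = gaussian_similarity(X, sigma=1.0)
L = normalized_laplacian(W)
```

Swapping `gaussian_similarity` for another kernel function changes only how W is formed; the Laplacian step is unchanged, which is one way to read the abstract's comparison of kernels.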
Pages: 12