An adaptive kernelized rank-order distance for clustering non-spherical data with high noise

Cited by: 64
Authors
Huang, Tianyi [1]
Wang, Shiping [2]
Zhu, William [1]
Affiliations
[1] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu, Peoples R China
[2] Fuzhou Univ, Coll Math & Comp Sci, Fuzhou, Peoples R China
Keywords
Unsupervised learning; Clustering; Rank-order; Kernel similarity; Non-spherical data; Noise; LINKAGE;
DOI
10.1007/s13042-020-01068-9
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering is a fundamental research topic in unsupervised learning, and the similarity measure is a key factor in clustering. However, existing similarity measures still struggle to cluster non-spherical data with high noise levels. Rank-order distance was proposed to capture the structures of non-spherical data by sharing the neighboring information of the samples, but it does not tolerate high noise well. To address this issue, we propose KROD, a new similarity measure that combines rank-order distance with a Gaussian kernel. By reducing the noise in the neighboring information of samples, KROD makes rank-order distance tolerant of high noise, so that the structures of non-spherical data with high noise levels can be captured well. KROD then strengthens these captured structures with the Gaussian kernel, so that samples in the same cluster are closer to each other and are easier to cluster correctly. Experiments illustrate that KROD can effectively improve existing methods for discovering non-spherical clusters with high noise levels. The source code can be downloaded from .
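The abstract does not give the KROD formula, so the following is only an illustrative sketch of the general idea it describes: a rank-based neighbor term combined with a Gaussian kernel. The mutual-rank term, the division by the kernel, the function names, and the `sigma` and `eps` parameters are all assumptions made for this example and should not be read as the authors' definition.

```python
import numpy as np

def pairwise_sq_dists(X):
    # Squared Euclidean distances between all rows of X.
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def rank_matrix(D):
    # ranks[i, j] = position of sample j in sample i's neighbor list
    # (0 for i itself, 1 for its nearest neighbor, and so on).
    # Assumes the samples are distinct, so each sample ranks itself first.
    n = D.shape[0]
    order = np.argsort(D, axis=1, kind="stable")
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(n)[None, :]
    return ranks

def kernelized_rank_distance(X, sigma=1.0, eps=1e-300):
    # Hypothetical combination, NOT the published KROD formula:
    # a symmetric mutual-rank term divided by a Gaussian kernel.
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    D2 = pairwise_sq_dists(X)
    ranks = rank_matrix(D2)
    # Sum of mutual ranks, scaled into [0, 1].
    rank_term = (ranks + ranks.T) / (2.0 * (n - 1))
    # Gaussian kernel similarity; floored at eps to avoid division by zero.
    kernel = np.maximum(np.exp(-D2 / (2.0 * sigma ** 2)), eps)
    # A high kernel similarity shrinks the distance, a low one inflates it.
    return rank_term / kernel

# Two small, well-separated groups of points (made-up data).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
D = kernelized_rank_distance(X, sigma=1.0)
```

In this toy setup the resulting matrix `D` could be passed to any clustering method that accepts precomputed distances. Dividing by the kernel is just one simple way to make nearby samples look closer and distant ones farther apart.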
Pages: 1735-1747
Page count: 13