An adaptive kernelized rank-order distance for clustering non-spherical data with high noise

Cited by: 64
Authors
Huang, Tianyi [1 ]
Wang, Shiping [2 ]
Zhu, William [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu, Peoples R China
[2] Fuzhou Univ, Coll Math & Comp Sci, Fuzhou, Peoples R China
Keywords
Unsupervised learning; Clustering; Rank-order; Kernel similarity; Non-spherical data; Noise; Linkage
DOI
10.1007/s13042-020-01068-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is a fundamental research topic in unsupervised learning, and the similarity measure is a key factor in clustering. However, existing similarity measures still struggle to cluster non-spherical data with high noise levels. Rank-order distance was proposed to capture the structures of non-spherical data by sharing the neighboring information of the samples, but it does not tolerate high noise well. To address this issue, we propose KROD, a new similarity measure that combines rank-order distance with a Gaussian kernel. By reducing the noise in the neighboring information of samples, KROD improves the noise tolerance of rank-order distance, so that the structures of non-spherical data with high noise levels can be captured well. KROD then strengthens these captured structures with the Gaussian kernel, so that samples in the same cluster are closer to each other and can be clustered correctly more easily. Experiments show that KROD can effectively improve existing methods for discovering non-spherical clusters with high noise levels. The source code can be downloaded from .
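The two ingredients named in the abstract, rank-order distance and a Gaussian kernel, can be illustrated with a minimal Python sketch. The rank-order part follows the commonly used definition (sum the ranks that each sample assigns to the other sample's nearest neighbors, then normalize by the smaller of the two mutual ranks). The final combination step, dividing by the kernel value, is only an assumption made for illustration and should not be read as the KROD formula from the paper. The function names, the `sigma` parameter, and the toy data are hypothetical, and the points are assumed to be distinct.

```python
import numpy as np

def neighbor_ranks(X):
    """Return squared distances, neighbor orderings, and rank positions."""
    diff = X[:, None, :] - X[None, :, :]
    sq = np.sum(diff ** 2, axis=-1)                 # pairwise squared Euclidean distances
    order = np.argsort(sq, axis=1, kind="stable")   # order[a, i] = i-th nearest neighbor of a
    ranks = np.argsort(order, axis=1)               # ranks[a, b] = position of b in a's list
    return sq, order, ranks

def rank_order_distance(order, ranks, a, b):
    """Symmetric rank-order distance between two distinct samples a and b."""
    def asym(p, q):
        # p's neighbors up to and including q, scored by their ranks in q's list
        top = order[p, : ranks[p, q] + 1]
        return ranks[q, top].sum()
    return (asym(a, b) + asym(b, a)) / min(ranks[a, b], ranks[b, a])

def kernelized_rod_sketch(X, sigma=1.0):
    """Illustrative combination only: rank-order distance divided by a Gaussian kernel."""
    sq, order, ranks = neighbor_ranks(X)
    kernel = np.exp(-sq / (2.0 * sigma ** 2))
    kernel = np.maximum(kernel, np.finfo(float).tiny)  # guard against underflow to zero
    n = X.shape[0]
    D = np.zeros((n, n))
    for a in range(n):
        for b in range(a + 1, n):
            D[a, b] = D[b, a] = rank_order_distance(order, ranks, a, b) / kernel[a, b]
    return D

# Toy data: two small, well-separated groups of points.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
D = kernelized_rod_sketch(X, sigma=5.0)
print(D[0, 1] < D[0, 3])  # a same-group pair should be closer than a cross-group pair
```

The resulting matrix `D` could be passed to any clustering routine that accepts a precomputed dissimilarity matrix. The nested loop is kept for readability and costs roughly cubic time in the number of samples.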
Pages: 1735-1747
Number of pages: 13