A local cores-based hierarchical clustering algorithm for data sets with complex structures

被引:36
作者
Cheng, Dongdong [1 ]
Zhu, Qingsheng [1 ]
Huang, Jinlong [2 ]
Wu, Quanwang [1 ]
Yang, Lijun [3 ]
机构
[1] Chongqing Univ, Coll Comp Sci, Chongqing, Peoples R China
[2] Yangtze Normal Univ, Coll Comp Engn, Chongqing, Peoples R China
[3] Southwest Minzu Univ, Sch Comp Sci & Technol, Chengdu, Sichuan, Peoples R China
关键词
Hierarchical clustering; Local cores; Complex structures; ROBUST;
D O I
10.1007/s00521-018-3641-8
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Hierarchical clustering is of great importance in data analysis. Although there are a number of hierarchical clustering algorithms including agglomerative methods, divisive methods and hybrid methods, most of them are sensitive to noise points, suffer from high computational cost and cannot effectively discover clusters with complex structures. When recognizing patterns from complex structures, humans intuitively tend to discover obvious clusters in dense regions firstly and then deal with objects on the border. Inspired by this idea, we propose a local cores-based hierarchical clustering algorithm called HCLORE. The proposed method first partitions the data set into several clusters by finding local cores, instead of optimizing an objective function through iteration like K-means; then temporarily removes points with lower local density, so that the boundary between clusters is clearer; after that merges clusters according to a newly defined similarities between clusters; and finally points with lower local density are assigned to the same clusters as their local cores belong to. The experimental results on synthetic data sets and real data sets show that our algorithm is more effective and efficient than existing methods when processing data sets with complex structures.
引用
收藏
页码:8051 / 8068
页数:18
相关论文
共 37 条
[1]   Clustering aggregation [J].
Yahoo Research Labs., Barcelona ;
不详 ;
不详 ;
不详 .
ACM Trans. Knowl. Discov. Data, 2007, 1
[2]   Efficient agglomerative hierarchical clustering [J].
Bouguettaya, Athman ;
Yu, Qi ;
Liu, Xumin ;
Zhou, Xiangmin ;
Song, Andy .
EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (05) :2785-2797
[3]   Robust path-based spectral clustering [J].
Chang, Hong ;
Yeung, Dit-Yan .
PATTERN RECOGNITION, 2008, 41 (01) :191-203
[4]   Parallel Spectral Clustering in Distributed Systems [J].
Chen, Wen-Yen ;
Song, Yangqiu ;
Bai, Hongjie ;
Lin, Chih-Jen ;
Chang, Edward Y. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2011, 33 (03) :568-586
[5]  
Chen Y, 2016, INF SCI
[6]   Natural neighbor-based clustering algorithm with local representatives [J].
Cheng, Dongdong ;
Zhu, Qingsheng ;
Huang, Jinlong ;
Yang, Lijun ;
Wu, Quanwang .
KNOWLEDGE-BASED SYSTEMS, 2017, 123 :238-253
[7]   Study on density peaks clustering based on k-nearest neighbors and principal component analysis [J].
Du, Mingjing ;
Ding, Shifei ;
Jia, Hongjie .
KNOWLEDGE-BASED SYSTEMS, 2016, 99 :135-145
[8]  
Ester M., 1996, P 2 INT C KNOWL DISC, V96, P226, DOI DOI 10.5555/3001460.3001507
[9]   Clustering by passing messages between data points [J].
Frey, Brendan J. ;
Dueck, Delbert .
SCIENCE, 2007, 315 (5814) :972-976
[10]   FLAME, a novel fuzzy clustering method for the analysis of DNA microarray data [J].
Fu, Limin ;
Medico, Enzo .
BMC BIOINFORMATICS, 2007, 8