Fast adaptive clustering by synchronization on large scale datasets

Cited by: 0
Authors
Ying, Wenhao [1, 2]
Xu, Min [1 ]
Wang, Shitong [1 ]
Deng, Zhaohong [1 ]
Affiliations
[1] School of Digital Media, Jiangnan University, Wuxi
[2] School of Computer Science and Engineering, Changshu Institute of Technology, Changshu
Source
Ying, W. (cslgywh@163.com) | Science Press / Issue 51
Keywords
Clustering; Kernel density estimation; Minimal enclosing ball; Reduced set density estimator; Synchronization
DOI
10.7544/issn1000-1239.2014.20120909
Abstract
The existing synchronization clustering algorithm Sync treats each attribute of a sample as a phase oscillator in the synchronization process. As a result, it has high time complexity and does not scale well to large datasets. To solve this problem, we propose a novel fast adaptive clustering algorithm, FAKCS. First, FAKCS uses a method based on the reduced set density estimator (RSDE) and the center-constrained minimal enclosing ball (CCMEB) to extract a reduced set of samples from the original dataset. It then clusters adaptively, using the Davies-Bouldin cluster criterion and a new order parameter that measures the degree of local synchronization. Moreover, we establish a relationship between the new order parameter and kernel density estimation (KDE), which reveals the probability-density nature of local synchronization. FAKCS can detect clusters of arbitrary shape, number and density on large-scale datasets without the number of clusters being set in advance. The effectiveness of the proposed method is demonstrated on image segmentation examples and in experiments on large UCI datasets.
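The abstract does not state the update rule used by Sync or the new order parameter introduced by FAKCS. As an illustration only, the sketch below assumes the Kuramoto-style interaction model commonly attributed to the original Sync algorithm [4]: every attribute of a sample is pulled toward its neighbors through a sine coupling, and a local order parameter r approaches 1 as each neighborhood collapses to a point. The function name `sync_step`, the radius `eps`, and the exact form of r are assumptions, not the authors' definitions. The all-pairs distance computation is quadratic in the number of samples, the kind of cost a sample-reduction step would aim to avoid.

```python
import numpy as np

def sync_step(x, eps):
    """One synchronous update of an assumed Kuramoto-style interaction model.

    x   : (n, d) array of samples; each attribute is treated as a phase
    eps : neighborhood radius (hypothetical parameter name)
    Returns the updated samples and a local order parameter r in (0, 1].
    """
    diff = x[None, :, :] - x[:, None, :]     # diff[i, j] = x[j] - x[i]
    dist = np.linalg.norm(diff, axis=2)      # pairwise Euclidean distances, O(n^2)
    near = dist <= eps                       # eps-neighborhood (includes the point itself)
    counts = near.sum(axis=1)                # always >= 1 because of the self-neighbor
    # Average sine coupling over each point's neighbors, per attribute.
    pull = (np.sin(diff) * near[:, :, None]).sum(axis=1)
    x_new = x + pull / counts[:, None]
    # Assumed local order parameter: mean of exp(-distance) over neighborhoods.
    r = np.mean((np.exp(-dist) * near).sum(axis=1) / counts)
    return x_new, r

def synchronize(x, eps, max_iter=50, tol=1e-6):
    """Iterate sync_step until r is within tol of 1 or max_iter is reached."""
    x = np.asarray(x, dtype=float)
    r = 0.0
    for _ in range(max_iter):
        x, r = sync_step(x, eps)
        if 1.0 - r < tol:
            break
    return x, r
```

After synchronization, samples that end at (nearly) the same location can be read off as one cluster; the sine coupling only attracts while attribute differences stay below pi, so the sketch assumes data scaled to a small range.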
Pages: 707-720
Page count: 13
Related papers
20 in total
  • [1] Jain A.K., Murty M.N., Flynn P.J., Data clustering: A review, ACM Computing Surveys, 31, 3, pp. 264-323, (1999)
  • [2] Sun J., Liu J., Zhao L., Clustering algorithms research, Journal of Software, 19, 1, pp. 48-61, (2008)
  • [3] Wang J., Wang S., Deng Z., Survey on challenges in clustering analysis research, Control and Decision, 27, 3, pp. 321-328, (2012)
  • [4] Bohm C., Plant C., Shao J., et al., Clustering by synchronization, Proc of the 16th ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining, pp. 583-592, (2010)
  • [5] Kim J., Scott C.D., L2 kernel classification, IEEE Trans on Pattern Analysis and Machine Intelligence, 32, 10, pp. 1822-1831, (2010)
  • [6] Freedman D., Kisilev P., Fast data reduction via KDE approximation, Proc of 2009 Data Compression Conference, pp. 445-445, (2009)
  • [7] Chao H., Girolami M., Novelty detection employing an L2 optimal non-parametric density estimator, Pattern Recognition Letters, 25, 12, pp. 1389-1397, (2004)
  • [8] Li C., Sun Z., Chen G., et al., Kernel density estimation and its application to clustering algorithm construction, Journal of Computer Research and Development, 41, 10, pp. 1712-1719, (2004)
  • [9] Zhang T., Zheng Z., Synchronization of coupled limit-cycle systems, Acta Physica Sinica, 53, 10, pp. 3287-3291, (2004)
  • [10] Moreno Y., Pacheco A.F., Synchronization of Kuramoto oscillators in scale-free networks, Europhysics Letters, 68, 4, pp. 603-609, (2004)