Grid-based clustering over an evolving data stream

Cited by: 5
Authors
Wan, Renxia [1]
Chen, Jingchao [1]
Wang, Lixin [1]
Su, Xiaoke [1]
Affiliations
[1] Donghua Univ, Coll Informat Sci & Technol, Shanghai 201620, Peoples R China
Keywords
clustering; data stream; grid clique; neighbouring grid; boundary grids; merging; acceptable distance; grid characteristic information;
DOI
10.1504/IJDMMM.2009.029033
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Clustering a data stream is challenging because it must be carried out within limited space and under strict time constraints, while the stream itself may be potentially infinite. Many clustering algorithms for data streams have been proposed, and they have greatly advanced the state of data stream clustering, but most are designed for convex clusters. This paper presents a grid-based clustering algorithm that first maps each data point to its corresponding grid and then iteratively merges the grids into clusters; only boundary grids are considered during the merging stage. The algorithm can also group an evolving data stream into arbitrarily shaped clusters. Compared with other algorithms of the same category, it requires fewer input parameters. Theoretical and experimental analysis indicates that the proposed algorithm outperforms algorithms of the same category in both effectiveness and efficiency.
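The two stages named in the abstract (mapping points to grids, then merging grids into clusters) can be illustrated with a minimal sketch. This is not the authors' algorithm: it handles a static batch rather than a stream, merges every pair of adjacent occupied grids instead of examining only boundary grids, and omits the acceptable distance and grid characteristic information listed in the keywords. The names `grid_cluster`, `cell_size`, and `min_points` are hypothetical, introduced here only for illustration.

```python
from collections import defaultdict, deque
from itertools import product


def grid_cluster(points, cell_size, min_points=1):
    """Group points into clusters of adjacent, sufficiently populated grid cells."""
    # Stage 1: map every point to the integer coordinates of its grid cell.
    cells = defaultdict(list)
    for p in points:
        key = tuple(int(x // cell_size) for x in p)
        cells[key].append(p)

    # Keep only cells with at least min_points points
    # (a hypothetical density threshold, not taken from the paper).
    dense = {k for k, pts in cells.items() if len(pts) >= min_points}

    # Stage 2: merge neighbouring dense cells with a breadth-first search.
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:
            cell = queue.popleft()
            # Visit every cell whose coordinates differ by at most 1 per axis.
            for offset in product((-1, 0, 1), repeat=len(cell)):
                nb = tuple(c + o for c, o in zip(cell, offset))
                if nb in dense and nb not in labels:
                    labels[nb] = next_label
                    queue.append(nb)
        next_label += 1

    # Collect the points belonging to each cluster label.
    clusters = defaultdict(list)
    for cell, label in labels.items():
        clusters[label].extend(cells[cell])
    return [clusters[i] for i in range(next_label)]


if __name__ == "__main__":
    sample = [(0.1, 0.1), (0.9, 0.2), (1.2, 0.3), (5.0, 5.0), (5.5, 5.6)]
    for cluster in grid_cluster(sample, cell_size=1.0):
        print(cluster)
```

Because clusters are formed by chaining adjacent grids, the result can follow non-convex shapes, which is the property the abstract highlights. The grid size here plays the role of the main input parameter.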
Pages: 393-410
Number of pages: 18