Efficient incremental density-based algorithm for clustering large datasets

Cited by: 31
Authors
Bakr, Ahmad M. [1]
Ghanem, Nagia M. [1]
Ismail, Mohamed A. [1]
Affiliations
[1] Univ Alexandria, Fac Engn, Comp & Syst Engn Dept, Alexandria, Egypt
Keywords
Incremental clustering; Density-based clustering; Document clustering; Information retrieval
DOI
10.1016/j.aej.2015.08.009
Chinese Library Classification (CLC) number
T [Industrial Technology]
Discipline classification code
08
Abstract
In dynamic information environments such as the web, the amount of information is increasing rapidly, so organizing it efficiently is more important than ever. Because the data change continuously, incremental clustering algorithms are generally preferred over traditional static algorithms. This paper introduces an enhanced version of the incremental DBSCAN algorithm for incrementally building and updating arbitrarily shaped clusters in large datasets. The proposed algorithm limits the search space to partitions rather than the whole dataset, which significantly improves performance compared with related incremental clustering algorithms. Experimental results on datasets of different sizes and dimensions show that the proposed algorithm speeds up the incremental clustering process by a factor of up to 3.2 compared with existing incremental algorithms. © 2015 Faculty of Engineering, Alexandria University. Production and hosting by Elsevier B.V.
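The abstract's central idea, restricting the neighborhood search for a newly inserted point to partitions instead of scanning the whole dataset, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how partitions are built, so the uniform 2-D grid below (cell side equal to eps) is purely an assumption, and the names GridIndex, insert, and neighbors are hypothetical. The sketch covers only the eps-neighborhood query that an incremental DBSCAN update would rely on, not cluster creation or merging.

```python
import math
from collections import defaultdict


class GridIndex:
    """Hypothetical partitioning: a uniform 2-D grid whose cell side is eps.

    Assumption for illustration only; the paper's partitioning scheme
    is not described in the abstract.
    """

    def __init__(self, eps):
        self.eps = eps
        self.cells = defaultdict(list)  # cell coordinates -> points in that cell
        self.size = 0

    def _cell(self, p):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(math.floor(x / self.eps) for x in p)

    def insert(self, p):
        self.cells[self._cell(p)].append(tuple(p))
        self.size += 1

    def neighbors(self, p):
        """Return (points within eps of p, number of candidates examined).

        Only the 3 x 3 block of cells around p is scanned. Because the
        cell side equals eps, any point within eps of p must lie there.
        """
        cx, cy = self._cell(p)
        found, examined = [], 0
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for q in self.cells.get((cx + dx, cy + dy), ()):
                    examined += 1
                    if math.dist(p, q) <= self.eps:
                        found.append(q)
        return found, examined


index = GridIndex(eps=1.0)
for point in [(0.1, 0.1), (0.5, 0.4), (0.9, 0.2), (10.0, 10.0), (10.3, 10.1)]:
    index.insert(point)

new_point = (0.4, 0.6)
found, examined = index.neighbors(new_point)  # query before inserting
index.insert(new_point)
print(len(found), "neighbors found;", examined, "of", index.size - 1, "points examined")
```

In this toy run, the query for the new point examines only the points stored in nearby cells and never touches the distant pair around (10, 10). A density-based update would then compare the neighbor count against a minimum-points threshold to decide whether the new point is a core point.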
Pages: 1147-1154
Page count: 8