An Improved DBSCAN Algorithm Using Local Parameters

Cited by: 3
Authors
Diao, Kejing [1]
Liang, Yongquan [1,2,3]
Fan, Jiancong [1,3]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao, Peoples R China
[2] Shandong Univ Sci & Technol, Prov Key Lab Informat Technol Wisdom Min Shandong, Qingdao, Peoples R China
[3] Shandong Univ Sci & Technol, Prov Expt Teaching Demonstrat Ctr Comp, Qingdao, Peoples R China
Source
ARTIFICIAL INTELLIGENCE (ICAI 2018) | 2018 / Vol. 888
Keywords
Clustering; Unbalanced data; Local parameters
DOI
10.1007/978-981-13-2122-1_1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Density-Based Spatial Clustering of Applications with Noise (DBSCAN), one of the classic density-based clustering algorithms, can identify clusters of different shapes and has been widely used in cluster analysis. Because DBSCAN uses a single global pair of parameters, ε and MinPts, it cannot recover the correct number of clusters on unbalanced data, and the clustering result is unsatisfactory. To solve this problem, this paper proposes LP-DBSCAN, a clustering algorithm that uses local parameters for unbalanced data. The algorithm first divides the data set into multiple data regions with the DPC algorithm; the size and shape of each region depend on the density characteristics of the samples. It then sets appropriate parameters for each data region, clusters each region locally, and finally merges the data regions. The algorithm is simple and easy to implement. Experimental results show that it overcomes these limitations of DBSCAN and can handle both arbitrarily shaped data and unbalanced data. On unbalanced data in particular, its clustering results are clearly better than those of other algorithms.
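The contrast the abstract draws between global and local parameters can be illustrated with a small sketch. This is not the authors' LP-DBSCAN implementation: the region assignment `region_of` is written by hand as a stand-in for the DPC partition, the `local_eps` heuristic (1.5 times the mean distance to the k-th nearest neighbour) is an assumption chosen only for illustration, and the merge step is reduced to relabelling clusters so that ids do not collide across regions.

```python
import math

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Brute-force eps-neighbourhood; includes the point itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels

def local_eps(region, k):
    """Hypothetical heuristic (not from the paper): 1.5 x mean k-th neighbour distance."""
    kth = []
    for p in region:
        d = sorted(math.dist(p, q) for q in region)
        kth.append(d[min(k, len(d) - 1)])  # d[0] is the distance to p itself
    return 1.5 * sum(kth) / len(kth)

def cluster_by_region(points, region_of, min_pts=3):
    """Run DBSCAN inside each region with its own eps, then relabel globally."""
    labels = [-1] * len(points)
    next_id = 0
    for r in sorted(set(region_of)):
        idx = [i for i, g in enumerate(region_of) if g == r]
        region = [points[i] for i in idx]
        local = dbscan(region, local_eps(region, min_pts), min_pts)
        for i, lab in zip(idx, local):
            if lab != -1:
                labels[i] = next_id + lab
        next_id += max(local, default=-1) + 1
    return labels

def grid(x0, y0, step):
    """3 x 3 grid of points starting at (x0, y0)."""
    return [(x0 + i * step, y0 + j * step) for i in range(3) for j in range(3)]

# Two dense clusters close together, plus one sparse cluster far away.
points = grid(0.0, 0.0, 0.1) + grid(0.7, 0.0, 0.1) + grid(10.0, 10.0, 1.0)
region_of = [0] * 18 + [1] * 9  # hand-assigned stand-in for the DPC partition

global_small = dbscan(points, eps=0.18, min_pts=3)  # sparse cluster becomes noise
global_large = dbscan(points, eps=1.8, min_pts=3)   # dense clusters merge
local = cluster_by_region(points, region_of, min_pts=3)

print(len(set(global_small) - {-1}), global_small.count(-1))
print(len(set(global_large) - {-1}), global_large.count(-1))
print(len(set(local) - {-1}), local.count(-1))
```

On this toy data no single global ε separates the two dense clusters while also keeping the sparse one, whereas a per-region ε recovers all three. Clusters that straddle a region boundary are not handled here; that case is what the merge step described in the abstract would have to address.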
Pages: 3-12
Page count: 10