Parameter selection algorithm of DBSCAN based on K-means two classification algorithm

Cited by: 9

Authors:

Chen, Shouhong [1,2]

Liu, Xinyu [1]

Ma, Jun [1]

Zhao, Shuang [1]

Hou, Xingna [1]

Affiliations:

[1] Guilin Univ Elect Technol, Guilin 541000, Peoples R China

[2] Jiangsu Univ, Zhenjiang 212013, Jiangsu, Peoples R China

Source:

JOURNAL OF ENGINEERING-JOE | Year 2019 / Volume 2019 / Issue 23

Keywords:

data analysis; unsupervised learning; pattern clustering; pattern classification; human factors; start critical value clustering; DBSCAN algorithm; neighbourhood radius; experimental data analysis show; minimum point number; maximum distance value; cluster centre-to-centre distance; data points; clustering centres; noise density clustering algorithm; classification algorithm; parameter selection algorithm;

DOI:

10.1049/joe.2018.9082

Chinese Library Classification number:

T [Industrial Technology];

Discipline classification code:

08;

Abstract:

Clustering is one of the most important tasks in unsupervised learning. For the density-based spatial clustering of applications with noise (DBSCAN) algorithm, the choice of the neighborhood radius and the minimum number of points is the key to obtaining the best clustering results. To address the difficulty of choosing these two parameters in the traditional DBSCAN algorithm, this article proposes a two-class partition based on the K-means algorithm, which yields two cluster centers. The distances between the data points and the cluster centers, and the center-to-center distance, are calculated. The number of data points within each search distance is then counted, together with the number of data points corresponding to the maximum distance value. These statistics are used to estimate the DBSCAN parameters, giving an initial neighborhood radius and a starting critical value for the minimum number of points. The parameters are then iterated and optimized until the data are divided into clusters and the most suitable neighborhood radius and minimum number of points are obtained. Analysis of the experimental data shows that the improved algorithm reduces the human factors in the traditional algorithm and improves efficiency, producing accurate clustering results.
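
The abstract does not give the exact procedure, so the following is only a minimal sketch of how the initial estimate might look, not the authors' implementation. Several choices are assumptions made for illustration: the function names `two_means` and `estimate_dbscan_params`, the deterministic farthest-point initialization, the use of each point's distance to its nearer center, and a histogram whose bin width is one tenth of the center-to-center distance. The most populated distance bin supplies the starting neighborhood radius and the starting minimum number of points; the later iterative refinement described in the abstract is not shown.

```python
import math


def two_means(points, iters=100):
    """K-means with k = 2 (Lloyd's algorithm) on a list of coordinate tuples.

    Initialization is an assumption: the point farthest from the data mean,
    then the point farthest from that first point.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    mean = tuple(sum(c) / len(points) for c in zip(*points))
    first = max(points, key=lambda p: math.dist(p, mean))
    second = max(points, key=lambda p: math.dist(p, first))
    centers = [first, second]
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            nearer = 0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            groups[nearer].append(p)
        # Recompute each center as the mean of its group (keep it if the group is empty).
        new_centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def estimate_dbscan_params(points, bins=10):
    """Return (eps0, min_pts0) as starting values for DBSCAN (hypothetical rule)."""
    centers = two_means(points)
    center_gap = math.dist(centers[0], centers[1])
    if center_gap == 0:
        raise ValueError("the two cluster centers coincide")
    width = center_gap / bins  # assumed bin width for the distance statistics
    counts = {}
    for p in points:
        d = min(math.dist(p, c) for c in centers)  # distance to the nearer center
        k = int(d // width)
        counts[k] = counts.get(k, 0) + 1
    best_bin, best_count = max(counts.items(), key=lambda kv: kv[1])
    eps0 = (best_bin + 1) * width  # upper edge of the most populated bin
    min_pts0 = best_count          # number of points in that bin
    return eps0, min_pts0


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    print(estimate_dbscan_params(data))
```

The returned pair would only be a starting point: under the abstract's description, both values are subsequently adjusted iteratively until the data separate into clusters.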

Pages: 8676-8679

Page count: 4

