KR-DBSCAN: A density-based clustering algorithm based on reverse nearest neighbor and influence space

Cited by: 35
Authors
Hu, Lihua [1]
Liu, Hongkai [1]
Zhang, Jifu [1]
Liu, Aiqin [1]
Affiliations
[1] Taiyuan Univ Sci & Technol, Sch Comp Sci & Technol, Taiyuan 030024, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Density-based clustering; Cluster expansion; Reverse nearest neighborhood; Influence space; Core object;
DOI
10.1016/j.eswa.2021.115763
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Density-based clustering is one of the most commonly used analysis methods in data mining and machine learning. Its advantage is that it can locate non-spherical clusters without the number of clusters being specified in advance. However, it has notable shortcomings, such as an inability to distinguish adjacent clusters of different densities. We propose a density-based clustering algorithm, KR-DBSCAN, which is based on the reverse nearest neighborhood and the influence space. The core objects are identified according to the reverse nearest neighborhood, and their influence spaces are determined by calculating the k-nearest neighborhood and reverse nearest neighborhood of each data object under the Euclidean distance metric. In particular, a new cluster expansion condition is defined using the reverse nearest neighborhood and its influence space: when core objects are within their influence spaces, they are added to the cluster by breadth-first traversal. As a result, adjacent clusters with different densities are effectively distinguished, and the computational load is substantially reduced. Boundary objects and noise objects are also identified using the k-nearest neighbors. KR-DBSCAN is experimentally validated on UCI datasets and several synthetic datasets.
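The abstract names the building blocks but does not give their exact definitions, so the following Python sketch is illustrative only and is not the authors' implementation. It assumes three things that the abstract does not state: that the influence space of an object is the intersection of its k-nearest neighborhood and its reverse nearest neighborhood, that an object is a core object when it has at least k reverse nearest neighbors, and that a non-core object is a boundary object when one of its k nearest neighbors is a core object. The function names (`knn_lists`, `reverse_neighbors`, `sketch_cluster`) are hypothetical.

```python
from collections import deque

import numpy as np


def knn_lists(X, k):
    """k nearest neighbours of every point (Euclidean), nearest first."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is not its own neighbour
    order = np.argsort(d, axis=1, kind="stable")[:, :k]
    return [[int(q) for q in row] for row in order]


def reverse_neighbors(knn):
    """Reverse neighbourhood: q is in rnn[p] when p is among q's k nearest."""
    rnn = [set() for _ in knn]
    for p, neighbours in enumerate(knn):
        for q in neighbours:
            rnn[q].add(p)
    return rnn


def sketch_cluster(X, k):
    """Label points with cluster ids (0, 1, ...) or -1 for noise."""
    n = len(X)
    knn = knn_lists(X, k)
    rnn = reverse_neighbors(knn)
    # ASSUMPTION: influence space = kNN(p) intersected with RNN(p).
    influence = [set(knn[p]) & rnn[p] for p in range(n)]
    # ASSUMPTION: core object = at least k reverse nearest neighbours.
    core = [len(rnn[p]) >= k for p in range(n)]

    labels = np.full(n, -1, dtype=int)
    cluster = 0
    for seed in range(n):
        if not core[seed] or labels[seed] != -1:
            continue
        labels[seed] = cluster
        queue = deque([seed])
        while queue:  # breadth-first expansion over core objects
            p = queue.popleft()
            for q in influence[p]:
                if core[q] and labels[q] == -1:
                    labels[q] = cluster
                    queue.append(q)
        cluster += 1

    # ASSUMPTION: a non-core point joins the cluster of its nearest core
    # neighbour among its k nearest neighbours; otherwise it stays noise.
    for p in range(n):
        if not core[p]:
            for q in knn[p]:
                if core[q]:
                    labels[p] = labels[q]
                    break
    return labels
```

Under these assumptions, a call such as `sketch_cluster(X, k=3)` on an `(n, d)` float array returns one integer label per row. The pairwise distance matrix costs O(n²) memory, so this version is only suitable for small inputs; a spatial index would be the usual replacement for larger ones.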
Pages: 8