DenForest: Enabling Fast Deletion in Incremental Density-Based Clustering over Sliding Windows

被引:3
作者
Kim, Bogyeong [1 ]
Koo, Kyoseung [1 ]
Enkhbat, Undraa [1 ]
Moon, Bongki [1 ]
机构
[1] Seoul Natl Univ, Seoul, South Korea
来源
PROCEEDINGS OF THE 2022 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA (SIGMOD '22) | 2022年
基金
新加坡国家研究基金会;
关键词
DenForest; Density-Based Clustering; Data Stream; Sliding Window; POINT CLOUDS; DBSCAN; ALGORITHMS; SEARCH;
D O I
10.1145/3514221.3517833
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
The density-based clustering is utilized for various applications such as hot spot detection or segmentation. To serve those applications in real time, it is desired to update clusters incrementally by capturing only the recent data. The previous incremental density-based clustering algorithms often represent clusters as a graph and suffer serious performance degradation. This is because a costly graph traversal is required to check whether a cluster is still connected whenever a point is removed. In order to address the problem of slow deletion, this paper proposes a novel incremental density-based clustering algorithm called DenForest. By maintaining clusters as a group of spanning trees instead of a graph, DenForest can determine efficiently and accurately whether a cluster is to be split by a point removed from the window in logarithmic time. With extensive evaluations, it is demonstrated that DenForest outperforms the state-of-the-art density-based clustering algorithms significantly and achieves the clustering quality comparable with that of DBSCAN.
引用
收藏
页码:296 / 309
页数:14
相关论文
共 55 条
  • [1] [Anonymous], 2017, PR MACH LEARN RES
  • [2] [Anonymous], 2011, Geolife GPS trajectory dataset-User Guide
  • [3] DECOMPOSABLE SEARCHING PROBLEMS
    BENTLEY, JL
    [J]. INFORMATION PROCESSING LETTERS, 1979, 8 (05) : 244 - 251
  • [4] MULTIDIMENSIONAL BINARY SEARCH TREES USED FOR ASSOCIATIVE SEARCHING
    BENTLEY, JL
    [J]. COMMUNICATIONS OF THE ACM, 1975, 18 (09) : 509 - 517
  • [5] Density-Based Clustering over an Evolving Data Stream with Noise
    Cao, Feng
    Ester, Martin
    Qian, Weining
    Zhou, Aoying
    [J]. PROCEEDINGS OF THE SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006, : 328 - +
  • [6] An Empirical Comparison of Stream Clustering Algorithms
    Carnein, Matthias
    Assenmacher, Dennis
    Trautmann, Heike
    [J]. ACM INTERNATIONAL CONFERENCE ON COMPUTING FRONTIERS 2017, 2017, : 361 - 366
  • [7] Robust path-based spectral clustering
    Chang, Hong
    Yeung, Dit-Yan
    [J]. PATTERN RECOGNITION, 2008, 41 (01) : 191 - 203
  • [8] Chen YX, 2007, KDD-2007 PROCEEDINGS OF THE THIRTEENTH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, P133
  • [9] ALGORITHMS FOR UPDATING MINIMAL SPANNING TREES
    CHIN, F
    HOUCK, D
    [J]. JOURNAL OF COMPUTER AND SYSTEM SCIENCES, 1978, 16 (03) : 333 - 344
  • [10] Conway J.H., 1999, GRUNDLEHREN MATH WIS, V290