Dynamic Density Based Clustering

被引:23
作者
Gan, Junhao [1 ]
Tao, Yufei [1 ]
机构
[1] Univ Queensland, Brisbane, Qld, Australia
来源
SIGMOD'17: PROCEEDINGS OF THE 2017 ACM INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA | 2017年
关键词
Approximate DBSCAN; Dynamic Clustering; Algorithms; ALGORITHMS;
D O I
10.1145/3035918.3064050
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Dynamic clustering-how to efficiently maintain data clusters along with updates in the underlying dataset-is a difficult topic. This is especially true for density-based clustering, where objects are aggregated based on transitivity of proximity, under which deciding the cluster(s) of an object may require the inspection of numerous other objects. The phenomenon is unfortunate, given the popular usage of this clustering approach in many applications demanding data updates. Motivated by the above, we investigate the algorithmic principles for dynamic clustering by DBSCAN, a successful representative of density-based clustering, and p-approximate DBSCAN, proposed to bring down the computational hardness of the former on static data. Surprisingly, we prove that the rho-approximate version suffers from the very same hardness when the dataset is fully dynamic, namely, when both insertions and deletions are allowed. We also show that this issue goes away as soon as tiny further relaxation is applied, yet still ensuring the same quality-known as the "sandwich guarantee"rho-of p-approximate DBSCAN. Our algorithms guarantee near-constant update processing, and outperform existing approaches by a factor over two orders of magnitude.
引用
收藏
页码:1493 / 1507
页数:15
相关论文
共 23 条
  • [1] Ankerst M, 1999, SIGMOD RECORD, VOL 28, NO 2 - JUNE 1999, P49
  • [2] [Anonymous], 2006, Introduction to Data Mining
  • [3] [Anonymous], 2013, THESIS TU EINDHOVEN
  • [4] [Anonymous], 1990, P 1990 ACM SIGMOD IN, DOI DOI 10.1145/93597.98741
  • [5] An optimal algorithm for approximate nearest neighbor searching in fixed dimensions
    Arya, S
    Mount, DM
    Netanyahu, NS
    Silverman, R
    Wu, AY
    [J]. JOURNAL OF THE ACM, 1998, 45 (06) : 891 - 923
  • [6] Density-Based Clustering over an Evolving Data Stream with Noise
    Cao, Feng
    Ester, Martin
    Qian, Weining
    Zhou, Aoying
    [J]. PROCEEDINGS OF THE SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006, : 328 - +
  • [7] A Dynamic Data Structure for 3-D Convex Hulls and 2-D Nearest Neighbor Queries
    Chan, Timothy M.
    [J]. JOURNAL OF THE ACM, 2010, 57 (03)
  • [8] New lower bounds for Hopcroft's problem
    Erickson, J
    [J]. DISCRETE & COMPUTATIONAL GEOMETRY, 1996, 16 (04) : 389 - 418
  • [9] Erickson J., 1995, P 7 CAN C COMP GEOM, P85
  • [10] Ester M., 1996, KDD-96 Proceedings. Second International Conference on Knowledge Discovery and Data Mining, P226