Multi-step density-based clustering

Authors
Stefan Brecheisen
Hans-Peter Kriegel
Martin Pfeifle
Affiliation
[1] University of Munich, Institute for Informatics
Keywords
Approximated clustering; Complex objects; Data mining; Density-based clustering
Abstract
Data mining in large databases of complex objects from scientific, engineering, or multimedia applications is becoming increasingly important. In many areas, complex distance measures are the first choice, but simpler distance functions that can be computed much more efficiently are also available. In this paper, we demonstrate how the paradigm of multi-step query processing, which relies on exact as well as lower-bounding approximated distance functions, can be integrated into the two density-based clustering algorithms DBSCAN and OPTICS, resulting in a considerable efficiency boost. Our approach confines itself to ε-range queries on the simple distance functions wherever possible and carries out complex distance computations only at those stages of the clustering algorithm where they are required to compute the correct clustering result. Furthermore, we show how our approach can be used for approximated clustering, allowing the user to find an individual trade-off between quality and efficiency. To assess the quality of the resulting clusterings, we introduce suitable quality measures that can be used generally for evaluating approximated partitioning and hierarchical clusterings. In a broad experimental evaluation based on real-world test data sets, we demonstrate that our approach accelerates the generation of exact density-based clusterings by more than one order of magnitude. We also show that our approximated clustering approach yields high-quality clusterings, where the desired quality is scalable with respect to the overall number of exact distance computations.
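The filter-and-refine idea behind multi-step query processing can be illustrated with a minimal Python sketch of a single ε-range query. This is not the authors' integration into DBSCAN or OPTICS; it only shows why a lower-bounding distance lets a query skip exact computations without losing results. The function names (`lower_bound_dist`, `exact_dist`, `multi_step_range_query`) and the choice of distances are assumptions made for illustration: the "expensive" distance is plain Euclidean distance, and the cheap lower bound is the absolute difference in the first coordinate, which can never exceed the Euclidean distance.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Sequence[float]


def lower_bound_dist(a: Point, b: Point) -> float:
    # Hypothetical cheap filter distance: |a[0] - b[0]| <= Euclidean distance,
    # so it is a valid lower bound of exact_dist.
    return abs(a[0] - b[0])


def exact_dist(a: Point, b: Point) -> float:
    # Stand-in for the complex, expensive distance measure (here: Euclidean).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def multi_step_range_query(
    query: Point,
    data: List[Point],
    eps: float,
    lb: Callable[[Point, Point], float] = lower_bound_dist,
    exact: Callable[[Point, Point], float] = exact_dist,
) -> Tuple[List[int], int]:
    """Return indices of objects within eps of query and the number of exact calls."""
    result: List[int] = []
    exact_calls = 0
    for idx, obj in enumerate(data):
        # Filter step: if the lower bound already exceeds eps, the exact
        # distance must exceed eps too, so the object is safely discarded.
        if lb(query, obj) > eps:
            continue
        # Refinement step: only surviving candidates pay for the exact distance.
        exact_calls += 1
        if exact(query, obj) <= eps:
            result.append(idx)
    return result, exact_calls


if __name__ == "__main__":
    points = [(0.0, 0.0), (1.0, 0.0), (0.5, 3.0), (5.0, 5.0)]
    hits, calls = multi_step_range_query((0.0, 0.0), points, eps=1.5)
    print(hits, calls)
```

Because the filter distance never overestimates the exact distance, the query returns the same result as a scan that uses only the exact distance, while skipping the exact computation for every object the filter rules out.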
Pages: 284-308
Page count: 24