Multi-step density-based clustering

Authors
Stefan Brecheisen
Hans-Peter Kriegel
Martin Pfeifle
Affiliation
[1] University of Munich, Institute for Informatics
Source
Keywords
Approximated clustering; Complex objects; Data mining; Density-based clustering
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Data mining in large databases of complex objects from scientific, engineering, or multimedia applications is becoming increasingly important. In many areas, complex distance measures are the first choice, but simpler distance functions that can be computed much more efficiently are also available. In this paper, we demonstrate how the paradigm of multi-step query processing, which relies on exact as well as lower-bounding approximated distance functions, can be integrated into the two density-based clustering algorithms DBSCAN and OPTICS, resulting in a considerable efficiency boost. Our approach tries to confine itself to ε-range queries on the simple distance functions and carries out complex distance computations only at those stages of the clustering algorithm where they are required to compute the correct clustering result. Furthermore, we show how our approach can be used for approximated clustering, allowing the user to find an individual trade-off between quality and efficiency. To assess the quality of the resulting clusterings, we introduce suitable quality measures that can be used generally for evaluating approximated partitioning and hierarchical clusterings. In a broad experimental evaluation based on real-world test data sets, we demonstrate that our approach accelerates the generation of exact density-based clusterings by more than one order of magnitude. Furthermore, we show that our approximated clustering approach yields high-quality clusterings, where the desired quality is scalable with respect to the overall number of exact distance computations.
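The filter-and-refine idea behind multi-step query processing can be illustrated with a small sketch. The Python code below is not the authors' algorithm: it shows only a generic multi-step ε-range query, whereas the paper integrates the paradigm more tightly into DBSCAN and OPTICS. The names `exact_dist`, `filter_dist`, and `multi_step_range_query` are hypothetical, and the two distance functions are toy stand-ins chosen so that the filter distance provably lower-bounds the exact one.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def exact_dist(a: Point, b: Point) -> float:
    """Euclidean distance; a toy stand-in for an expensive exact measure."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def filter_dist(a: Point, b: Point) -> float:
    """Distance on the first coordinate only.

    It never exceeds the Euclidean distance, so it is a lower bound.
    """
    return abs(a[0] - b[0])


def multi_step_range_query(
    db: Sequence[Point],
    q: Point,
    eps: float,
    lower_bound: Callable[[Point, Point], float] = filter_dist,
    exact: Callable[[Point, Point], float] = exact_dist,
) -> Tuple[List[int], int]:
    """Return indices of objects within eps of q and the number of exact calls."""
    hits: List[int] = []
    exact_calls = 0
    for i, obj in enumerate(db):
        # Filter step: if the lower bound already exceeds eps, the exact
        # distance must exceed eps too, so the object is safely discarded.
        if lower_bound(q, obj) > eps:
            continue
        # Refinement step: only surviving candidates pay for the exact distance.
        exact_calls += 1
        if exact(q, obj) <= eps:
            hits.append(i)
    return hits, exact_calls


db = [(0.0, 0.0), (1.0, 0.0), (1.0, 5.0), (10.0, 0.0)]
hits, exact_calls = multi_step_range_query(db, (0.0, 0.0), eps=2.0)
print(hits, exact_calls)
```

Because the filter distance is a lower bound, the result is identical to a scan that uses the exact distance on every object; the saving comes only from the candidates that the filter step eliminates.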
Pages: 284-308
Page count: 24
Related papers
50 in total
  • [1] Multi-step density-based clustering
    Brecheisen, S
    Kriegel, HP
    Pfeifle, M
    KNOWLEDGE AND INFORMATION SYSTEMS, 2006, 9 (03) : 284 - 308
  • [2] Density-based clustering
    Campello, Ricardo J. G. B.
    Kroeger, Peer
    Sander, Jorg
    Zimek, Arthur
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2020, 10 (02)
  • [3] Density-based clustering
    Kriegel, Hans-Peter
    Kroeger, Peer
    Sander, Joerg
    Zimek, Arthur
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2011, 1 (03) : 231 - 240
  • [4] Name Disambiguation Method based on Multi-step Clustering
    Gu, S.
    Xu, X.
    Zhu, J.
    Ji, L.
    7TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT 2016) / THE 6TH INTERNATIONAL CONFERENCE ON SUSTAINABLE ENERGY INFORMATION TECHNOLOGY (SEIT-2016) / AFFILIATED WORKSHOPS, 2016, 83 : 488 - 495
  • [5] Two Step Density-Based Object-Inductive Clustering Algorithm
    Lytvynenko, Volodymyr
    Lurie, Irina
    Krejci, Jan
    Voronenko, Mariia
    Savina, Nataliia
    Taif, Mohamed Ali
    MOMLET&DS-2019: MODERN MACHINE LEARNING TECHNOLOGIES AND DATA SCIENCE, 2019, 2386 : 117 - 135
  • [6] Density-Based Clustering of Polygons
    Joshi, Deepti
    Samal, Ashok K.
    Soh, Leen-Kiat
    2009 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING, 2009, : 171 - 178
  • [7] Density-Based Clustering with Constraints
    Lasek, Piotr
    Gryz, Jarek
    COMPUTER SCIENCE AND INFORMATION SYSTEMS, 2019, 16 (02) : 469 - 489
  • [8] Directional density-based clustering
    Saavedra-Nieves, Paula
    Fernandez-Perez, Martin
    ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2024,
  • [9] Active Density-Based Clustering
    Mai, Son T.
    He, Xiao
    Hubig, Nina
    Plant, Claudia
    Boehm, Christian
    2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2013, : 508 - 517
  • [10] An efficient density-based clustering for multi-dimensional database
    Zhang, Lieliang
    Li, Zhiyang
    Liu, Weijiang
    Qu, Wenyu
    Wu, Yinan
    2017 4TH INTERNATIONAL CONFERENCE ON INFORMATION, CYBERNETICS AND COMPUTATIONAL SOCIAL SYSTEMS (ICCSS), 2017, : 361 - 366