Density-based clustering with non-continuous data

Times cited: 2
Authors
Azzalini, Adelchi [1]
Menardi, Giovanna [1]
Affiliation
[1] Univ Padua, Dipartimento Sci Stat, Padua, Italy
Keywords
Density estimation; Mixed variables; Modal clustering; Model-based clustering; Multidimensional scaling; DISCRIMINANT-ANALYSIS; MODEL; TREE;
DOI
10.1007/s00180-016-0644-8
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
Density-based clustering relies on the idea of associating groups with regions of the sample space characterized by high density of the probability distribution underlying the observations. While this approach to cluster analysis exhibits some desirable properties, its use is necessarily limited to continuous data only. The present contribution proposes a simple but working way to circumvent this problem, based on the identification of continuous components underlying the non-continuous variables. The basic idea is explored in a number of variants applied to simulated data, confirming the practical effectiveness of the technique and leading to recommendations for its practical usage. Some illustrations using real data are also presented.
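The idea summarized in the abstract (recover continuous components from non-continuous variables, then cluster by density) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a square-root simple matching dissimilarity for categorical variables, classical multidimensional scaling to obtain continuous coordinates, and a Gaussian mean shift as a stand-in for the density-based step. All function names, the toy data, and the bandwidth value are hypothetical choices made for illustration.

```python
import numpy as np

def matching_dissimilarity(X):
    """Square root of the fraction of variables on which two rows disagree.
    (Assumed choice; the square root keeps the dissimilarity Euclidean.)"""
    X = np.asarray(X)
    mismatch = (X[:, None, :] != X[None, :, :]).mean(axis=2)
    return np.sqrt(mismatch)

def classical_mds(D, n_components=1):
    """Classical (Torgerson) MDS: continuous coordinates from dissimilarities."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared dissimilarities
    eigval, eigvec = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:n_components]
    lam = np.clip(eigval[order], 0.0, None)      # guard against tiny negative values
    return eigvec[:, order] * np.sqrt(lam)

def mean_shift_labels(Y, bandwidth, n_iter=200):
    """Move every point uphill on a Gaussian kernel density estimate,
    then give the same label to points that end up at the same mode."""
    Z = Y.copy()
    for _ in range(n_iter):
        sq = ((Z[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
        W = np.exp(-0.5 * sq / bandwidth ** 2)
        Z = (W @ Y) / W.sum(axis=1, keepdims=True)
    modes, labels = [], np.empty(len(Y), dtype=int)
    for i, z in enumerate(Z):
        for k, m in enumerate(modes):
            if np.linalg.norm(z - m) < bandwidth / 2:
                labels[i] = k
                break
        else:                                    # no existing mode is close enough
            modes.append(z)
            labels[i] = len(modes) - 1
    return labels

# Toy categorical data: two groups, each with a few single-variable deviations.
rows = [
    ["a", "x", "p", "u"], ["a", "x", "p", "u"], ["a", "x", "p", "v"],
    ["a", "x", "q", "u"], ["a", "y", "p", "u"], ["b", "x", "p", "u"],
    ["b", "y", "q", "v"], ["b", "y", "q", "v"], ["b", "y", "q", "u"],
    ["b", "y", "p", "v"], ["b", "x", "q", "v"], ["a", "y", "q", "v"],
]
D = matching_dissimilarity(rows)
Y = classical_mds(D, n_components=1)             # one dominant dimension in this toy case
labels = mean_shift_labels(Y, bandwidth=0.2)
print(labels)
```

One coordinate is retained here only because the toy data have a single dominant dimension. With real mixed data, the number of retained coordinates and the bandwidth would both need to be chosen with care.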
Pages: 771-798
Number of pages: 28