Generalized density attractor clustering for incomplete data

Cited by: 1
Authors
Leibrandt, Richard [1]
Guennemann, Stephan [1]
Affiliations
[1] Tech Univ Munich, Dept Informat, Data Analyt & Machine Learning, Boltzmannstr 3, D-85748 Garching, Germany
Keywords
Clustering; Missing values; Incomplete datasets; Kernel density estimation; MEAN SHIFT; IMPUTATION;
DOI
10.1007/s10618-022-00904-6
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mean shift is a popular and powerful clustering method for implementing density attractor clustering (DAC). However, DAC is underdeveloped in terms of modeling definitions and methods for incomplete data. Due to DAC's importance, solving this common issue is crucial. This work makes DAC more versatile by making it applicable to incomplete data: First, using formal modeling definitions, we propose a unifying framework for DAC. Second, we propose new methods that implement the definitions and perform DAC for incomplete data more efficiently and stably than others. We discuss and compare our methods and the closest competitor using theoretical analyses. We quantify the performance of our methods using synthetic datasets with known structures and real-life business data for three missing value types. Finally, we analyze Stack Overflow's 2021 survey to extract clusters of programmers from India and the USA. The experiments verify our methods' superiority to six alternatives. Code, Data:
Pages: 970 - 1009
Page count: 40
Related papers
50 in total
  • [1] Generalized density attractor clustering for incomplete data
    Richard Leibrandt
    Stephan Günnemann
    Data Mining and Knowledge Discovery, 2023, 37 : 970 - 1009
  • [2] Attractor Density Clustering
    Carroll, T. L.
    Byers, J. M.
    PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON APPLICATIONS IN NONLINEAR DYNAMICS (ICAND 2016), 2017, 6 : 139 - 149
  • [3] Effective Density-Based Clustering Algorithms for Incomplete Data
    Zhonghao Xue
    Hongzhi Wang
    Big Data Mining and Analytics, 2021, 4 (03) : 183 - 194
  • [4] Effective Density-Based Clustering Algorithms for Incomplete Data
    Xue, Zhonghao
    Wang, Hongzhi
    BIG DATA MINING AND ANALYTICS, 2021, 4 (03) : 183 - 194
  • [5] GENERALIZED DENSITY CLUSTERING
    Rinaldo, Alessandro
    Wasserman, Larry
    ANNALS OF STATISTICS, 2010, 38 (05): : 2678 - 2722
  • [6] Relational data clustering with incomplete data
    Hathaway, RJ
    Overstreet, DD
    Murphy, TE
    Bezdek, JC
    APPLICATIONS AND SCIENCE OF COMPUTATIONAL INTELLIGENCE IV, 2001, 4390 : 273 - 280
  • [7] A Method of Incomplete Data Three-Way Clustering Based on Density Peaks
    Yang, Lin
    Hou, Kaiyan
    6TH INTERNATIONAL CONFERENCE ON COMPUTER-AIDED DESIGN, MANUFACTURING, MODELING AND SIMULATION (CDMMS 2018), 2018, 1967
  • [8] A generalized fuzzy clustering framework for incomplete data by integrating feature weighted and kernel learning
    Yang Y.
    Chen H.
    Wu H.
    PeerJ Computer Science, 2023, 9
  • [9] A generalized fuzzy clustering framework for incomplete data by integrating feature weighted and kernel learning
    Yang, Ying
    Chen, Haoyu
    Wu, Haoshen
    PEERJ COMPUTER SCIENCE, 2023, 9
  • [10] On Fuzzy Clustering for Incomplete Spherical Data and for Incomplete Multivariate Categorical Data
    Kanzawa, Yuchi
    2018 JOINT 10TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS (SCIS) AND 19TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS (ISIS), 2018, : 638 - 643