Adaptive kernel fuzzy clustering for missing data

Cited by: 3
Authors
Rodrigues, Anny K. G. [1 ]
Ospina, Raydonal [1 ]
Ferreira, Marcelo R. P. [2 ]
Affiliations
[1] Univ Fed Pernambuco, CCEN, Dept Estat, CASTLab, Recife, PE, Brazil
[2] Univ Fed Paraiba, Ctr Ciencias Exatas & Nat, Dept Estat, DataLab, Joao Pessoa, Paraiba, Brazil
Source
PLOS ONE | 2021 / Vol. 16 / Issue 11
Keywords
MULTIPLE IMPUTATION; ALGORITHM; FRAMEWORK; VALUES;
DOI
10.1371/journal.pone.0259266
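Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code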
07 ; 0710 ; 09 ;
Abstract
Many machine learning procedures, including cluster analysis, are often affected by missing values. This work proposes and evaluates a kernel fuzzy C-means clustering algorithm based on kernelization of the metric with local adaptive distances (VKFCM-K-LP) under three strategies for dealing with missing data. The first strategy, called the Whole Data Strategy (WDS), performs clustering only on the complete part of the dataset, i.e., it discards all instances with missing data. The second approach uses the Partial Distance Strategy (PDS), in which partial distances are computed from all available values and then rescaled by the reciprocal of the proportion of observed values. The third technique, called the Optimal Completion Strategy (OCS), computes missing values iteratively as auxiliary variables in the optimization of a suitable objective function. The clustering results were evaluated according to different metrics. The best performance of the clustering algorithm was achieved under the PDS and OCS strategies. Under the OCS approach, new datasets were derived and the missing values were estimated dynamically in the optimization process. The clustering results under the OCS strategy were also superior to those obtained by applying the VKFCM-K-LP algorithm to a version of the data in which missing values had previously been imputed by the mean or the median of the observed values.
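The WDS and PDS ideas described in the abstract can be illustrated with a minimal sketch. It assumes a plain squared Euclidean distance and NaN-coded missing values; the paper itself works with kernelized, locally adaptive distances, which are not reproduced here. The function names `partial_sq_distance` and `complete_cases` are hypothetical and chosen only for illustration.

```python
import numpy as np

def partial_sq_distance(x, v):
    """PDS-style distance (assumed form): squared Euclidean distance over
    the observed coordinates of x, rescaled by the reciprocal of the
    proportion of observed values, i.e. by p / (number observed)."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    observed = ~np.isnan(x)          # True where x has a value
    n_obs = observed.sum()
    if n_obs == 0:
        raise ValueError("x has no observed coordinates")
    partial = np.sum((x[observed] - v[observed]) ** 2)
    return (x.size / n_obs) * partial

def complete_cases(X):
    """WDS-style filtering: keep only rows without any missing value."""
    X = np.asarray(X, dtype=float)
    return X[~np.isnan(X).any(axis=1)]

# Toy data: NaN marks a missing entry.
X = np.array([[1.0, 2.0, np.nan],
              [0.0, 0.0, 0.0],
              [np.nan, 1.0, 1.0]])
v = np.array([0.0, 0.0, 0.0])        # a hypothetical cluster prototype

d = partial_sq_distance(X[0], v)     # (3 / 2) * (1 + 4) = 7.5
X_wds = complete_cases(X)            # only the second row survives
```

In this toy case WDS keeps a single instance out of three, while PDS still uses every instance that has at least one observed value.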
Pages: 33
Related papers
50 in total
  • [1] Kernel fuzzy clustering methods based on local adaptive distances
    Ferreira, Marcelo R. P.
    de Carvalho, Francisco de A. T.
    2012 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2012,
  • [2] ROUGH FUZZY SUBSPACE CLUSTERING FOR DATA WITH MISSING VALUES
    Siminski, Krzysztof
    COMPUTING AND INFORMATICS, 2014, 33 (01) : 131 - 153
  • [3] Kernel-Based Fuzzy Clustering of Interval Data
    Pimentel, Bruno A.
    da Costa, Anderson F. B. F.
    de Souza, Renata M. C. R.
    IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011), 2011, : 497 - 501
  • [4] Mercer Kernel Based Fuzzy Clustering Self-Adaptive Algorithm
    李侃
    刘玉树
    Journal of Beijing Institute of Technology(English Edition), 2004, (04) : 351 - 354
  • [5] Different Approaches for Missing Data Handling in Fuzzy Clustering: A Review
    Goel, Sonia
    Tushir, Meena
    RECENT ADVANCES IN ELECTRICAL & ELECTRONIC ENGINEERING, 2020, 13 (06) : 841 - 854
  • [6] Fuzzy Clustering Algorithm of Kernel for Gene Expression Data Analysis
    Liu, Wenyuan
    Zhang, Bin
    2009 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION SYSTEMS AND APPLICATIONS, PROCEEDINGS, 2009, : 553 - 556
  • [7] Fuzzy clustering algorithm of kernel for gene expression data analysis
    Chen, Zhiru
    Hong, Wenxue
    Wang, Changwu
    ICIC Express Letters, 2009, 3 (04) : 1435 - 1440
  • [8] A Collaborative Kernel Fuzzy Clustering
    Gao Cui-Fang
    Wu Xiao-Jun
    2011 AASRI CONFERENCE ON ARTIFICIAL INTELLIGENCE AND INDUSTRY APPLICATION (AASRI-AIIA 2011), VOL 2, 2011, : 374 - 377
  • [9] Robust kernel fuzzy clustering
    Du, WW
    Inoue, K
    Urahama, K
    FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, PT 1, PROCEEDINGS, 2005, 3613 : 454 - 461
  • [10] Multiple Kernel Fuzzy Clustering
    Huang, Hsin-Chien
    Chuang, Yung-Yu
    Chen, Chu-Song
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2012, 20 (01) : 120 - 134