Fuzzy information decomposition incorporated and weighted Relief-F feature selection: When imbalanced data meet incompletion

Cited by: 18
Authors
Dou, Jun [1]
Song, Yan [1]
Wei, Guoliang [2]
Zhang, Yameng [1]
Affiliations
[1] Univ Shanghai Sci & Technol, Dept Control Sci & Engn, Shanghai 200093, Peoples R China
[2] Univ Shanghai Sci & Technol, Coll Sci, Shanghai Key Lab Modern Opt Syst, Shanghai 200093, Peoples R China
Keywords
Imbalanced class; Incomplete data; Feature selection; Weighted Relief-F; Fuzzy information decomposition; MISSING DATA IMPUTATION; MACHINE; SMOTE
DOI
10.1016/j.ins.2021.10.057
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Data classification is an important task in data analysis, and it suffers seriously from unknown features, imbalanced classes, and incomplete data. Despite their practical significance, few results have been reported that address these three distinct issues together. To address this problem, we propose a novel feature selection method for data subject to incomplete data and imbalanced class, namely, improved fuzzy information decomposition (IFID) incorporated and weighted Relief-F (WRelief-F) feature selection. The main idea of the proposed feature selection method is threefold. (1) The proposed IFID algorithm deals with the imbalanced class and incomplete data at the same time. (2) In IFID, a new membership function is provided to appropriately reflect the influence of the observed data on the missing values. On this basis, a finer information decomposition is adopted to achieve a better recovery than the traditional FID. (3) After applying IFID, WRelief-F is put forward to properly take into account the relationship of the target instance to both the inter-class and the intra-class instances. Finally, experiments on seven public data sets show the effectiveness and universal applicability of the proposed feature selection algorithm. (c) 2021 Elsevier Inc. All rights reserved.
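For orientation, the baseline that WRelief-F builds on can be sketched as follows. This is a minimal illustration of the standard Relief-F weight update (nearest hits lower a feature's weight, nearest misses raise it, with misses scaled by class priors). It is not the paper's WRelief-F or IFID, whose weighting and membership function are not given in this record. The function name `relief_f`, the parameter `k`, and the toy data are hypothetical, and the sketch assumes complete numeric features and visits every instance instead of sampling.

```python
from collections import Counter, defaultdict


def relief_f(X, y, k=3):
    """Standard Relief-F feature weights (assumption: NOT the paper's WRelief-F)."""
    n, d = len(X), len(X[0])

    # Per-feature value ranges, used to scale differences into [0, 1].
    spans = []
    for f in range(d):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def dist(a, b):
        # Manhattan distance over the scaled per-feature differences.
        return sum(diff(f, a, b) for f in range(d))

    priors = {c: cnt / n for c, cnt in Counter(y).items()}
    weights = [0.0] * d

    for i in range(n):
        # Group all other instances by class label.
        by_class = defaultdict(list)
        for j in range(n):
            if j != i:
                by_class[y[j]].append(j)

        for c, idxs in by_class.items():
            nearest = sorted(idxs, key=lambda j: dist(X[i], X[j]))[:k]
            if c == y[i]:
                scale = -1.0  # nearest hits: penalize features that differ
            else:
                # nearest misses: reward features that differ, weighted by prior
                scale = priors[c] / (1.0 - priors[y[i]])
            for f in range(d):
                avg = sum(diff(f, X[i], X[j]) for j in nearest) / len(nearest)
                weights[f] += scale * avg / n

    return weights


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1], [0.9, 0.2], [1.0, 0.8], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]
print(relief_f(X, y, k=2))
```

Features with larger weights are the ones whose values differ more across classes than within a class, so a selector would keep the top-ranked features. On imbalanced or incomplete data this baseline has no special handling, which is the gap the abstract says IFID and WRelief-F are meant to address.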
Pages: 417-432
Page count: 16