A novel feature selection framework for incomplete data

Authors
Guo, Cong [1]
Yang, Wei [1]
Li, Zheng [1]
Liu, Chun [1]
Affiliation
[1] Henan Univ, Sch Comp & Informat Engn, Henan Key Lab Big Data Anal & Proc, Henan Engn Lab Spatial Informat Proc, Kaifeng 475004, Peoples R China
Keywords
Feature selection; Incomplete data; ReliefF; Matrix completion; Missing values; Classification
DOI
10.1016/j.chemolab.2024.105193
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Feature selection on incomplete datasets is a challenging task. Existing methods typically first complete the dataset with an imputation method and then perform feature selection on the imputed data. Because missing value imputation and feature selection are carried out independently, feature importance cannot be taken into account during imputation. In real-world datasets, however, features differ in importance. To address this, we propose a novel feature selection framework for incomplete data that takes feature importance into account. The framework consists mainly of two stages that alternate iteratively: the M-stage and the W-stage. In the M-stage, missing values are imputed based on a given feature importance vector and multiple initial imputation results. In the W-stage, an improved ReliefF algorithm learns the feature importance vector from the imputed data. The feature importance vector output by the W-stage in the current iteration serves as the input to the M-stage in the next iteration. Experimental results on datasets with artificially introduced and naturally occurring missing values show that the proposed method significantly outperforms other approaches.
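The alternating structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the M-stage update or the improvements to ReliefF, so this sketch assumes a feature-weighted k-nearest-neighbour imputation as a stand-in for the M-stage (starting from a single mean-imputed matrix rather than multiple initial imputations) and the basic Relief update with one nearest hit and one nearest miss as a stand-in for the W-stage. All function names and parameters (`weighted_knn_impute`, `relief_weights`, `alternate`, `k`, `n_iter`) are hypothetical.

```python
import numpy as np

def weighted_knn_impute(X_imp, mask, w, k=3):
    """M-stage stand-in (assumption): refill each missing cell with the mean of
    the k nearest rows, where distance is weighted by feature importance w."""
    X_new = X_imp.copy()
    for i in range(X_imp.shape[0]):
        miss = mask[i]
        if not miss.any():
            continue
        # Importance-weighted Euclidean distance to every other row.
        d = np.sqrt((((X_imp - X_imp[i]) ** 2) * w).sum(axis=1))
        d[i] = np.inf  # exclude the row itself
        nn = np.argsort(d)[:k]
        X_new[i, miss] = X_imp[nn][:, miss].mean(axis=0)
    return X_new

def relief_weights(X, y):
    """W-stage stand-in (assumption): basic Relief, not the improved ReliefF
    of the paper. Returns non-negative weights that sum to 1."""
    n, p = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Z = (X - X.min(axis=0)) / span  # scale each feature to [0, 1]
    w = np.zeros(p)
    for i in range(n):
        d = np.abs(Z - Z[i]).sum(axis=1)
        d[i] = np.inf
        same = (y == y[i])
        hit = np.argmin(np.where(same, d, np.inf))    # nearest same-class row
        miss = np.argmin(np.where(~same, d, np.inf))  # nearest other-class row
        w += np.abs(Z[i] - Z[miss]) - np.abs(Z[i] - Z[hit])
    w = np.clip(w / n, 0, None)
    return w / w.sum() if w.sum() > 0 else np.full(p, 1.0 / p)

def alternate(X, y, n_iter=5, k=3):
    """Alternate the two stages: the weights learned in one iteration are
    the input to the imputation step of the next."""
    mask = np.isnan(X)
    X_imp = np.where(mask, np.nanmean(X, axis=0), X)  # initial mean imputation
    w = np.full(X.shape[1], 1.0 / X.shape[1])         # start from uniform weights
    for _ in range(n_iter):
        X_imp = weighted_knn_impute(X_imp, mask, w, k)  # M-stage stand-in
        w = relief_weights(X_imp, y)                    # W-stage stand-in
    return X_imp, w
```

Calling `alternate(X, y)` on a NumPy array `X` containing `np.nan` entries and a label vector `y` returns the imputed matrix and the learned weight vector. Observed entries are never modified; only cells flagged in `mask` are rewritten in each iteration. The sketch assumes every class has at least two samples, so that each row has a nearest hit.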
Pages: 13