A novel feature selection framework for incomplete data

Authors
Guo, Cong [1]
Yang, Wei [1]
Li, Zheng [1]
Liu, Chun [1]
Affiliations
[1] Henan Univ, Sch Comp & Informat Engn, Henan Key Lab Big Data Anal & Proc, Henan Engn Lab Spatial Informat Proc, Kaifeng 475004, Peoples R China
Keywords
Feature selection; Incomplete data; ReliefF; Matrix completion; Missing values; Classification
DOI
10.1016/j.chemolab.2024.105193
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Feature selection on incomplete datasets is a challenging task. To address this challenge, existing methods first employ imputation methods to complete the dataset and then perform feature selection on the imputed dataset. Because missing value imputation and feature selection are entirely independent in this pipeline, the importance of features cannot be considered during imputation. However, in real-world datasets, different features have varying degrees of importance. To this end, we propose a novel incomplete data feature selection framework that considers feature importance. The framework mainly consists of two alternating iterative stages: the M-stage and the W-stage. In the M-stage, missing values are imputed based on a given feature importance vector and multiple initial imputation results. In the W-stage, an improved ReliefF algorithm is employed to learn the feature importance vector from the imputed data. In particular, the feature importance output by the W-stage in the current iteration is used as the input of the M-stage in the next iteration. Experimental results on artificial and real missing datasets demonstrate that the proposed method significantly outperforms other approaches.
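The alternating M-stage and W-stage loop described in the abstract can be sketched roughly as follows. This is an illustrative simplification, not the authors' implementation: the abstract does not give the update rules, so the sketch assumes a single mean-based initial imputation (rather than multiple initial imputations), a feature-weighted k-nearest-neighbour refill for the M-stage, and a basic Relief score with one nearest hit and one nearest miss for the W-stage (rather than the improved ReliefF). All function names (`mean_impute`, `relief_weights`, `weighted_knn_impute`, `alternate`) and parameters (`k`, `n_iter`) are hypothetical.

```python
import numpy as np


def mean_impute(X):
    """Fill NaNs with the column mean (one simple initial imputation)."""
    filled = X.copy()
    col_means = np.nanmean(X, axis=0)
    rows, cols = np.where(np.isnan(X))
    filled[rows, cols] = col_means[cols]
    return filled


def relief_weights(X, y):
    """W-stage stand-in: basic Relief with one nearest hit and one nearest miss.

    Assumes every class has at least two samples and at least two classes exist.
    """
    n, d = X.shape
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Z = (X - X.min(axis=0)) / span           # scale each feature to [0, 1]
    w = np.zeros(d)
    for i in range(n):
        dist = np.abs(Z - Z[i]).sum(axis=1)
        dist[i] = np.inf                      # exclude the sample itself
        same = y == y[i]
        hit = np.argmin(np.where(same, dist, np.inf))
        miss = np.argmin(np.where(~same, dist, np.inf))
        w += np.abs(Z[i] - Z[miss]) - np.abs(Z[i] - Z[hit])
    w = np.clip(w / n, 0.0, None)             # treat negative scores as zero
    total = w.sum()
    return w / total if total > 0 else np.full(d, 1.0 / d)


def weighted_knn_impute(X_missing, X_filled, w, k=3):
    """M-stage stand-in: refill missing entries from the k nearest rows,
    using a feature-weighted Euclidean distance on the current filled matrix."""
    missing = np.isnan(X_missing)
    out = X_filled.copy()
    for i in np.where(missing.any(axis=1))[0]:
        dist = np.sqrt((w * (X_filled - X_filled[i]) ** 2).sum(axis=1))
        dist[i] = np.inf
        nbrs = np.argsort(dist)[:k]
        out[i, missing[i]] = X_filled[nbrs][:, missing[i]].mean(axis=0)
    return out


def alternate(X_missing, y, n_iter=5, k=3):
    """Alternate imputation (M-stage) and weight learning (W-stage)."""
    d = X_missing.shape[1]
    w = np.full(d, 1.0 / d)                   # start from uniform importance
    X_filled = mean_impute(X_missing)
    for _ in range(n_iter):
        X_filled = weighted_knn_impute(X_missing, X_filled, w, k)  # M-stage
        w = relief_weights(X_filled, y)                            # W-stage
    return X_filled, w
```

Calling `alternate(X, y)` on a float array `X` with `np.nan` marking missing entries returns the completed matrix and a non-negative importance vector that sums to 1. The point of the loop is the feedback: the weights learned in one pass change which rows count as neighbours in the next imputation pass.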
Pages: 13