Information-theoretic partially labeled heterogeneous feature selection based on neighborhood rough sets

Cited by: 14
Authors
Zhang, Hongying [1]
Sun, Qianqian [1]
Dong, Kezhen [1]
Affiliations
[1] Xi An Jiao Tong Univ, Sch Math & Stat, Xian 710049, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature selection; Monotonic entropy; Partially labeled heterogeneous data; Attribute reduction
DOI
10.1016/j.ijar.2022.12.010
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the rapid growth of large-scale, real-world datasets, it has become critical to address partially labeled heterogeneous feature selection, in which samples are described by both numerical and categorical features and some samples have no labels. Existing solutions typically rely on linear correlations between features. In this paper, three monotonic uncertainty measures are defined on equivalence classes and neighborhood classes to study partially labeled heterogeneous feature selection by exploring nonlinear correlations. First, consistent entropy and monotonic neighborhood entropy, based on classical rough set theory and neighborhood rough set theory, are proposed to construct a uniform measure for feature selection in heterogeneous datasets. Furthermore, a maximal neighborhood entropy strategy is developed by considering the inconsistency of neighborhood classes described by the features and partial labels. Finally, two feature selection algorithms are presented based on the three novel monotonic uncertainty measures. Comparative experiments demonstrate the effectiveness and superiority of the newly proposed feature selection measures. (c) 2022 Elsevier Inc. All rights reserved.
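The abstract names its measures without defining them. As a rough illustration of the underlying idea only, the sketch below builds neighborhood classes on heterogeneous data and computes the classical neighborhood entropy from the neighborhood rough set literature; it is not the paper's consistent, monotonic, or maximal neighborhood entropy. The function names, the exact-match rule for categorical features, the Euclidean distance on numerical features, and the threshold `delta` are all assumptions made for this example.

```python
import math

def neighborhood(data, i, num_idx, cat_idx, delta):
    """Indices of samples in the neighborhood of sample i (hypothetical helper).

    Assumption: categorical features must match exactly, and the Euclidean
    distance over the numerical features must not exceed delta.
    """
    xi = data[i]
    members = []
    for j, xj in enumerate(data):
        if any(xi[k] != xj[k] for k in cat_idx):
            continue
        dist = math.sqrt(sum((xi[k] - xj[k]) ** 2 for k in num_idx))
        if dist <= delta:
            members.append(j)
    return members

def neighborhood_entropy(data, num_idx, cat_idx, delta):
    """Classical neighborhood entropy: -(1/n) * sum_i log2(|N(x_i)| / n)."""
    n = len(data)
    total = 0.0
    for i in range(n):
        size = len(neighborhood(data, i, num_idx, cat_idx, delta))
        total += math.log2(size / n)
    return -total / n

# Toy heterogeneous data: one numerical feature, one categorical feature.
samples = [
    (0.10, "a"),
    (0.15, "a"),
    (0.80, "a"),
    (0.12, "b"),
]

h_both = neighborhood_entropy(samples, num_idx=[0], cat_idx=[1], delta=0.1)
h_numeric_only = neighborhood_entropy(samples, num_idx=[0], cat_idx=[], delta=0.1)
print(h_both, h_numeric_only)
```

On this toy data, adding the categorical feature shrinks the neighborhoods and so does not decrease the entropy, which is the kind of monotonic behavior with respect to the feature set that a greedy forward selection procedure can exploit. Labels play no role in this simplified measure, whereas the measures in the paper are stated to account for partial labels.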
Pages: 200-217
Page count: 18