Feature selection using Fisher score and multilabel neighborhood rough sets for multilabel classification

Cited by: 143
Authors
Sun, Lin [1 ,3 ,4 ]
Wang, Tianxiang [1 ]
Ding, Weiping [2 ]
Xu, Jiucheng [1 ,4 ]
Lin, Yaojin [3 ]
Affiliations
[1] Henan Normal Univ, Coll Comp & Informat Engn, Xinxiang 453007, Henan, Peoples R China
[2] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
[3] Minnan Normal Univ, Key Lab Data Sci & Intelligence Applicat, Zhangzhou 363000, Peoples R China
[4] Key Lab Artificial Intelligence & Personalized Le, Xinxiang 453007, Henan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature selection; Neighborhood rough sets; Fisher Score; Multilabel classification; LABEL FEATURE-SELECTION; UNCERTAINTY MEASURES; INFORMATION;
DOI
10.1016/j.ins.2021.08.032
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In recent years, feature selection for multilabel classification has attracted attention in machine learning and data mining. However, some feature selection methods ignore the correlations among labels, resulting in low performance, and most of them have difficulty determining an appropriate neighborhood radius for neighborhood systems and incur a high time cost. To overcome these issues, we propose a novel feature selection method using Fisher score and multilabel neighborhood rough sets (MNRS) in multilabel neighborhood decision systems. First, to identify the correlations between labels under a binary distribution, two types of new mutual information between labels are considered, and their balance coefficients are defined. By enhancing strong correlations and weakening weak correlations between labels, a mutual information-based Fisher score model with a second-order correlation between labels is designed to fit multilabel data. Second, to address the problem of automatically choosing a neighborhood radius, a subset of heterogeneous and homogeneous samples is employed to develop a new classification margin as a neighborhood radius, and the concepts of neighborhood, neighborhood class, and upper and lower approximations are formulated for multilabel neighborhood decision systems. The weight and dependency degree are presented to effectively measure the uncertainty of samples in multilabel neighborhood decision systems. Thus, we further present a new classification margin-based MNRS model. Finally, a filter-wrapper preprocessing algorithm for feature selection using the improved Fisher score model is proposed to decrease the spatiotemporal complexity of multilabel data, and a heuristic feature selection algorithm is designed to improve classification performance on multilabel datasets.
Experimental results on thirteen multilabel datasets show that the proposed algorithm is effective in selecting significant features, demonstrating its excellent classification ability in multilabel datasets. (c) 2021 Elsevier Inc. All rights reserved.
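The Fisher score that the abstract builds on can be illustrated with a minimal Python sketch. This is not the authors' model: it computes the classic single-label Fisher score for each feature and combines the per-label scores with a plain weighted average. The function names and the `label_weights` parameter are hypothetical; the uniform default weights merely stand in for the mutual information-based balance coefficients, whose definition is not given in this record.

```python
from statistics import fmean


def fisher_score(values, labels):
    """Classic Fisher score of one feature against one label.

    Ratio of between-class scatter to within-class scatter.
    """
    overall_mean = fmean(values)
    between, within = 0.0, 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        group_mean = fmean(group)
        between += len(group) * (group_mean - overall_mean) ** 2
        within += sum((v - group_mean) ** 2 for v in group)
    if within == 0:
        # No spread inside any class: perfectly separating or constant feature.
        return float("inf") if between > 0 else 0.0
    return between / within


def multilabel_fisher_scores(X, Y, label_weights=None):
    """Weighted average of per-label Fisher scores for every feature.

    ASSUMPTION: label_weights is a hypothetical stand-in for the paper's
    label-correlation coefficients; by default all labels count equally.
    """
    n_features, n_labels = len(X[0]), len(Y[0])
    if label_weights is None:
        label_weights = [1.0 / n_labels] * n_labels
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        score = sum(
            w * fisher_score(column, [row[k] for row in Y])
            for k, w in enumerate(label_weights)
        )
        scores.append(score)
    return scores


# Toy data: feature 0 tracks both labels, feature 1 does not.
X = [[1.0, 2.0], [1.2, 4.0], [3.0, 2.1], [3.2, 3.9]]
Y = [[0, 0], [0, 0], [1, 1], [1, 1]]
scores = multilabel_fisher_scores(X, Y)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

In a filter step of this kind, features would be ranked by score and only the top-ranked ones passed on to the more expensive rough set based selection; how many to keep is a design choice not specified here.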
Pages: 887 - 912
Number of pages: 26
Related papers
50 in total
  • [21] Feature selection for blind image steganalysis using neighborhood rough sets
    Chen, Yingyue
    Chen, Yumin
    Yin, Aimin
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2019, 37 (03) : 3709 - 3720
  • [22] Learning Instance-Level Label Correlation Distribution for Multilabel Classification With Fuzzy Rough Sets
    Che, Xiaoya
    Chen, Degang
    Mi, Jusheng
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2023, 31 (08) : 2871 - 2884
  • [23] Distributed Selection of Continuous Features in Multilabel Classification Using Mutual Information
    Gonzalez-Lopez, Jorge
    Ventura, Sebastian
    Cano, Alberto
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (07) : 2280 - 2293
  • [24] A Multimodal Multiobjective Evolutionary Algorithm for Filter Feature Selection in Multilabel Classification
    Hancer E.
    Xue B.
    Zhang M.
    IEEE Transactions on Artificial Intelligence, 2024, 5 (09) : 4428 - 4442
  • [25] Fuzzy Neighborhood-Based Manifold Learning and Feature Weight Matrix for Multilabel Feature Selection
    Sun, Lin
    Zhang, Qifeng
    Ding, Weiping
    Xu, Jiucheng
    KNOWLEDGE-BASED SYSTEMS, 2024, 299
  • [26] Feature Selection via Label Enhancement and Weighted Neighborhood Mutual Information for Multilabel Data
    Sun, Lin
    Guo, Jiaqi
    Wu, Xuejiao
    Xu, Jiucheng
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT II, ICIC 2024, 2024, 14876 : 470 - 480
  • [27] Granule-specific feature selection for continuous data classification using neighborhood rough sets
    Sewwandi, Mahawaga Arachchige Nayomi Dulanjala
    Li, Yuefeng
    Zhang, Jinglan
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 238
  • [28] Multilabel Feature Selection With Constrained Latent Structure Shared Term
    Gao, Wanfu
    Li, Yonghao
    Hu, Liang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (03) : 1253 - 1262
  • [29] Multilabel Feature Selection: A Local Causal Structure Learning Approach
    Yu, Kui
    Cai, Mingzhu
    Wu, Xingyu
    Liu, Lin
    Li, Jiuyong
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (06) : 3044 - 3057
  • [30] Gene selection for tumor classification using neighborhood rough sets and entropy measures
    Chen, Yumin
    Zhang, Zunjun
    Zheng, Jianzhong
    Ma, Ying
    Xue, Yu
    JOURNAL OF BIOMEDICAL INFORMATICS, 2017, 67 : 59 - 68