Sparse feature selection via local feature and high-order label correlation

Cited by: 9
Authors
Sun, Lin [1, 2]
Ma, Yuxuan [2 ]
Ding, Weiping [3 ]
Xu, Jiucheng [2 ]
Affiliations
[1] Tianjin Univ Sci & Technol, Coll Artificial Intelligence, Tianjin 300457, Peoples R China
[2] Henan Normal Univ, Coll Comp & Informat Engn, Xinxiang 453007, Peoples R China
[3] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature selection; Loss function; Manifold learning; High-order label correlation; Neighborhood rough sets; Score
DOI
10.1007/s10489-023-05136-9
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Some existing feature selection approaches neglect the correlation among labels, and almost all manifold-based multilabel learning models do not consider the relationship between features and labels, which reduces classification performance. To overcome these shortcomings, this work develops a new sparse feature selection approach via local feature and high-order label correlation. First, sparse processing is performed by applying the l2,1 norm to the weight coefficient matrix, and a loss function between the sample and label matrices is established to explore the potential relationship between features and labels. The designed loss function is made sparse directly through the weight coefficient matrix, which gives the weight of each feature, and the features with higher scores are then selected. Second, manifold learning is combined with the Laplacian score to handle local features and make full use of local feature correlation. The manifold regularization for embedded feature selection guides the exploration of potential real labels and the selection of different features for individual labels. Finally, a self-representation strategy is employed to keep the rank of the high-order label matrix from being damaged. The high-order label weight matrix and the label error term are then defined to enhance the accuracy of label self-representation and to correct the deviation between the self-representation scheme and the real labels, while Frobenius and l2 norm regularization avoids trivial solutions and overfitting. A representation function of high-order label correlation is proposed based on the self-representation strategy, which can accurately represent the potential information between high-order labels. Thus, the local features are scored by the Laplacian score, and an optimal feature subset with higher scores is selected.
Experiments on 16 multilabel datasets show that the constructed algorithm is effective in obtaining important feature sets and achieves strong classification performance on multilabel classification.
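The first step of the abstract (an l2,1-norm penalty on the weight coefficient matrix, with features ranked by their weights) can be illustrated with a minimal sketch. This is not the authors' objective: it assumes a plain least-squares loss between the sample matrix X and the label matrix Y, omits the manifold regularization and label self-representation terms, and uses a standard iteratively reweighted solver. The function names and the parameters `lam`, `n_iter`, and `eps` are hypothetical.

```python
import numpy as np

def l21_norm(W):
    """l2,1 norm: sum of the Euclidean norms of the rows of W."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

def l21_feature_scores(X, Y, lam=1.0, n_iter=50, eps=1e-8):
    """Approximately minimise ||XW - Y||_F^2 + lam * ||W||_{2,1}.

    X: (n_samples, n_features) sample matrix.
    Y: (n_samples, n_labels) label matrix.
    Returns one score per feature (the row norms of W).
    Assumption: generic iteratively reweighted least squares,
    not the solver described in the paper.
    """
    n_features = X.shape[1]
    D = np.eye(n_features)          # reweighting matrix, starts as identity
    XtX, XtY = X.T @ X, X.T @ Y
    for _ in range(n_iter):
        # Stationarity condition: (X^T X + lam * D) W = X^T Y
        W = np.linalg.solve(XtX + lam * D, XtY)
        row_norms = np.linalg.norm(W, axis=1)
        # D_ii = 1 / (2 * ||w_i||); eps guards against division by zero
        D = np.diag(1.0 / (2.0 * row_norms + eps))
    return np.linalg.norm(W, axis=1)
```

Rows of W that shrink toward zero correspond to features that contribute little to any label, so sorting the returned scores in descending order gives a feature ranking under these simplifying assumptions.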
Pages: 565-591
Number of pages: 27