Feature selection using symmetric uncertainty and hybrid optimization for high-dimensional data

Times cited: 5
Authors
Sun, Lin [1 ,3 ]
Sun, Shujing [1 ]
Ding, Weiping [2 ]
Huang, Xinyue [1 ]
Fan, Peiyi [1 ]
Li, Kunyu [1 ]
Chen, Leqi [1 ]
Affiliations
[1] Henan Normal Univ, Coll Comp & Informat Engn, Xinxiang 453007, Peoples R China
[2] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
[3] Tianjin Univ Sci & Technol, Coll Artificial Intelligence, Tianjin 300457, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature selection; Symmetric uncertainty; Feature clustering; Hybrid optimization; High-dimensional data; INFORMATION; ALGORITHM;
DOI
10.1007/s13042-023-01897-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
When handling high-dimensional data, searching for the optimal feature subset is extremely difficult because the cost of the search grows exponentially with the number of features, and most feature selection models neglect the interactions among features or between a feature and the decision class. This paper develops a novel feature selection approach using symmetric uncertainty and hybrid optimization for high-dimensional data (FSUHO). In the first stage, to fully reflect the interactions among features and between a feature and the decision class, the F-relevance between features and the C-correlation between a feature and the decision class are constructed from the symmetric uncertainty to remove redundant features. A strong correlation threshold is then improved using the C-correlation and a random coefficient to prevent effective features from being removed in this stage. In the second stage, to reduce the computational cost, one criterion for judging a weakly correlated feature is designed to sort all features, and another criterion is developed to select the class centers. The similarity between features and class centers is calculated, similar features are clustered into one class, and a symmetric uncertainty correlation-based feature clustering model is thereby constructed. In the third stage, a hybrid optimization approach that combines the particle swarm optimizer (PSO) and the wild horse optimizer (WHO) is proposed for feature selection: an association-guided group initialization probability with a multiobjective optimized particle selection scheme is defined as the criterion by which the PSO selects stallion particles for the WHO, and the WHO is improved by integrating a nonlinear inertia weight factor and a Brownian motion operator to obtain the optimal feature subset. Together, these stages form a novel three-stage feature selection algorithm. Experimental results on 16 datasets demonstrate the efficiency of FSUHO on high-dimensional feature selection problems in terms of classification accuracy and running time.
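The abstract builds both of its correlation measures on symmetric uncertainty. As a rough illustration only, the sketch below computes the standard symmetric uncertainty of two discrete variables, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), which lies in [0, 1]. The function names and toy data are assumptions made for this example. The paper's exact F-relevance and C-correlation definitions, its threshold, and its clustering criteria are not given in the abstract and are not reproduced here.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """Standard symmetric uncertainty: 2 * I(X; Y) / (H(X) + H(Y))."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so nothing can be shared
    h_xy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy    # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


# Toy data (hypothetical): two discrete features and a class label.
labels = [0, 0, 1, 1]
feature_a = [0, 0, 1, 1]  # mirrors the class label exactly
feature_b = [0, 1, 0, 1]  # statistically independent of the class label

su_a = symmetric_uncertainty(feature_a, labels)
su_b = symmetric_uncertainty(feature_b, labels)
print(su_a, su_b)
```

In this toy case the feature that mirrors the class label scores at the top of the range and the independent feature scores at the bottom, which is the kind of ranking signal a filter stage based on symmetric uncertainty relies on. Continuous features would need to be discretized first.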
Pages: 4339-4360
Page count: 22