Feature selection using symmetric uncertainty and hybrid optimization for high-dimensional data

Cited by: 6
Authors
Sun, Lin [1, 3]
Sun, Shujing [1 ]
Ding, Weiping [2 ]
Huang, Xinyue [1 ]
Fan, Peiyi [1 ]
Li, Kunyu [1 ]
Chen, Leqi [1 ]
Affiliations
[1] Henan Normal Univ, Coll Comp & Informat Engn, Xinxiang 453007, Peoples R China
[2] Nantong Univ, Sch Informat Sci & Technol, Nantong 226019, Peoples R China
[3] Tianjin Univ Sci & Technol, Coll Artificial Intelligence, Tianjin 300457, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature selection; Symmetric uncertainty; Feature clustering; Hybrid optimization; High-dimensional data; INFORMATION; ALGORITHM;
DOI
10.1007/s13042-023-01897-4
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For high-dimensional data, searching for the optimal feature subset has become extremely difficult because of the exponential growth of the search procedure, and most existing feature selection models neglect the interactions among features and between features and the decision class. This paper develops a novel feature selection approach using symmetric uncertainty and hybrid optimization (FSUHO) for high-dimensional data. In the first stage, to fully reflect these interactions, the F-relevance between features and the C-correlation between a feature and the decision class are constructed from symmetric uncertainty to remove redundant features; a strong-correlation threshold is then improved using the C-correlation and a random coefficient to prevent effective features from being removed. In the second stage, to reduce the expensive computational cost, one criterion for judging weakly correlated features is designed to sort all features, and another is developed to select the class centers. The similarity between features and class centers is calculated, and similar features are clustered into one class, yielding a symmetric uncertainty correlation-based feature clustering model. In the third stage, a hybrid optimization approach combining the particle swarm optimizer (PSO) and the wild horse optimizer (WHO) is proposed for feature selection: an association-guided group initialization probability with a multiobjective optimized particle selection scheme is defined as the criterion by which the PSO selects stallion particles for the WHO, and an improved WHO is developed by integrating a nonlinear inertia weight factor and a Brownian motion operator to obtain the optimal feature subset. Finally, a novel three-stage feature selection algorithm is developed. Experimental results on 16 datasets demonstrate the efficiency of FSUHO on high-dimensional feature selection problems in terms of classification accuracy and running time.
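As a rough illustration of the symmetric uncertainty measure on which the first stage is built, the sketch below uses the standard definition SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)) for discrete variables. The abstract does not give the paper's exact formulas, so the function names, the toy data, and the use of SU directly as the C-correlation (feature vs. class) and F-relevance (feature vs. feature) scores are assumptions made here for illustration only.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """Standard symmetric uncertainty: 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        # Both variables are constant; treat them as uninformative.
        return 0.0
    hxy = entropy(list(zip(x, y)))      # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy         # I(X; Y)
    return 2.0 * mutual_info / (hx + hy)


# Hypothetical toy data: two binary features and a binary class label.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [0, 0, 1, 1]

c_corr_f1 = symmetric_uncertainty(f1, label)  # feature vs. class (assumed C-correlation)
c_corr_f2 = symmetric_uncertainty(f2, label)
f_rel_12 = symmetric_uncertainty(f1, f2)      # feature vs. feature (assumed F-relevance)
```

In this toy case f1 determines the label exactly, so its score is 1, while f2 is independent of both the label and f1, so its scores are 0. A redundancy filter in this spirit would keep f1 and treat f2 as a weakly correlated candidate; the thresholding with a random coefficient described in the abstract is not reproduced here.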
Pages: 4339-4360
Page count: 22