A Surrogate-Assisted Multi-Phase Ensemble Feature Selection Algorithm With Particle Swarm Optimization in Imbalanced Data

Times cited: 0
Authors
Song, Xianfang [1]
Jiang, Zhi [1]
Zhang, Yong [1]
Peng, Chao [1]
Guo, Yinan [2]
Affiliations
[1] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221008, Peoples R China
[2] China Univ Min & Technol Beijing, Sch Mech Elect Informat Engn, Beijing 100083, Peoples R China
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2025
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Particle swarm optimization; evolutionary computation; feature selection; surrogate
DOI
10.1109/TETCI.2025.3548786
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Feature selection based on evolutionary algorithms (EAs) is an effective dimension reduction technology. However, existing EAs are still constrained by high computational cost and a tendency to converge to local optima when handling high-dimensional data with imbalanced classes. To this end, a surrogate-assisted multi-phase ensemble feature selection algorithm with particle swarm optimization (SMEFS-PSO) is proposed, which combines the strengths of filter-based feature selection, surrogate-assisted EAs, and a local search strategy. In the first phase of SMEFS-PSO, an ensemble filter feature selection method is adapted to rapidly remove irrelevant and weakly relevant features. Next, to reduce the cost of identifying redundant features, a surrogate-assisted PSO is developed in the second phase. Then, a well-designed problem-specific local search strategy is introduced in the third phase to enhance the local search capability. Furthermore, a representative instance selection strategy based on boundary distribution is developed, which constructs a surrogate for the whole data set. For majority classes, class boundary instances and center instances are selected as representative instances, while for minority classes, oversampling is used to select representative instances. The processing of imbalanced data is thus integrated into the construction of the surrogate, which not only handles the class imbalance problem but also greatly reduces the running cost of the second phase. Finally, SMEFS-PSO is compared with 9 state-of-the-art feature selection algorithms on 16 benchmark problems. The experimental results demonstrate that SMEFS-PSO achieves superior classification performance at lower computing cost on high-dimensional imbalanced feature selection problems.
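The representative instance selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the selection rules, so everything concrete here is an assumption. A class is treated as a majority class when its size is at least the mean class size, center instances are those nearest the class centroid, boundary instances are those nearest to any instance of another class, and minority classes are oversampled by random duplication. The function name `select_representatives` and its parameters are hypothetical.

```python
import numpy as np

def select_representatives(X, y, n_boundary=5, n_center=5, seed=0):
    """Hypothetical sketch: pick a small surrogate sample from (X, y).

    Assumes at least two classes are present in y.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    mean_count = counts.mean()
    target = n_boundary + n_center  # assumed per-class surrogate size
    selected = []
    for c, count in zip(classes, counts):
        idx = np.flatnonzero(y == c)
        Xc = X[idx]
        if count >= mean_count:
            # Majority class (assumed rule: size >= mean class size).
            # Center instances: closest to the class centroid.
            d_center = np.linalg.norm(Xc - Xc.mean(axis=0), axis=1)
            center = idx[np.argsort(d_center)[:n_center]]
            # Boundary instances: smallest distance to any other-class instance.
            X_other = X[y != c]
            pairwise = np.linalg.norm(Xc[:, None, :] - X_other[None, :, :], axis=2)
            boundary = idx[np.argsort(pairwise.min(axis=1))[:n_boundary]]
            chosen = np.union1d(center, boundary)
        elif count < target:
            # Minority class: keep every instance and oversample by random
            # duplication (an assumed stand-in for the paper's oversampling).
            extra = rng.choice(idx, size=target - count, replace=True)
            chosen = np.concatenate([idx, extra])
        else:
            chosen = idx
        selected.append(chosen)
    sel = np.concatenate(selected)
    return X[sel], y[sel]
```

A wrapper-style fitness function could then be evaluated on the returned subset instead of the full data, which is the sense in which the abstract speaks of a surrogate for the whole data set.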
Pages: 16