Feature selection with clustering probabilistic particle swarm optimization

Cited by: 5
Authors
Gao, Jinrui [1 ,2 ]
Wang, Ziqian [1 ,2 ]
Lei, Zhenyu [1 ,2 ]
Wang, Rong-Long [3 ,4 ]
Wu, Zhengwei [5 ,6 ]
Gao, Shangce [1 ,2 ]
Affiliations
[1] Univ Toyama, Fac Engn, Toyama 9308555, Japan
[2] Toyama Univ, Gofuku 3190, Toyama, Toyama 9308555, Japan
[3] Univ Fukui, Fac Engn, Fukui 9108507, Japan
[4] Univ Fukui, 3-9-1 Bunkyo, Fukui, Fukui 9108507, Japan
[5] Tongji Univ, State Key Lab Marine Geol, Shanghai 200092, Peoples R China
[6] Tongji Univ, 1239 Siping Rd, Shanghai 200092, Peoples R China
Funding
Japan Society for the Promotion of Science; Japan Science and Technology Agency;
Keywords
Feature selection; Particle swarm optimization; Classification; Clustering algorithms; ALGORITHM; CLASSIFICATION; PSO;
DOI
10.1007/s13042-024-02111-9
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Dealing with high-dimensional data poses a significant challenge in machine learning, and feature selection is a widely used remedy. Because the feature-selection search space is large and intricate, swarm intelligence algorithms have gained popularity for their strong search capabilities. This study introduces Clustering Probabilistic Particle Swarm Optimization (CPPSO), which extends traditional particle swarm optimization by representing velocity as selection probabilities and by adding an elitism mechanism. CPPSO further employs a K-means-based clustering strategy that uses the Hamming distance to divide the population into two sub-populations, improving search performance. CPPSO is compared against seven existing algorithms on twenty real-world datasets: fifteen that are frequently used in feature selection research, and five that are imbalanced or multi-label. The experimental results show that CPPSO outperforms the comparative algorithms on the majority of the datasets across a range of evaluation criteria.
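The abstract only outlines the mechanism, so the following is a minimal illustrative sketch (not the authors' code) of the two ideas it names: a probabilistic "velocity" that gives each feature a selection probability, and a K-means-style split of the binary particle population into two sub-populations using the Hamming distance. All names (hamming, cluster_two_groups, sample_positions) and the majority-vote centroid update are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def hamming(a, b):
    """Hamming distance between two binary feature masks."""
    return np.count_nonzero(a != b)

def cluster_two_groups(population, iters=10):
    """K-means-style split into two sub-populations using Hamming distance.
    Cluster centers are kept binary via a per-bit majority vote (an assumption;
    the paper's exact centroid update is not given in the abstract)."""
    centers = population[rng.choice(len(population), size=2, replace=False)]
    for _ in range(iters):
        # assign each particle to the nearest center in Hamming distance
        labels = np.array(
            [np.argmin([hamming(p, c) for c in centers]) for p in population]
        )
        # update each center as the bitwise majority of its members
        for k in range(2):
            members = population[labels == k]
            if len(members):
                centers[k] = (members.mean(axis=0) >= 0.5).astype(int)
    return labels

def sample_positions(prob_velocity):
    """Probabilistic 'velocity': each entry is the probability that the
    corresponding feature is selected in the new binary position."""
    return (rng.random(prob_velocity.shape) < prob_velocity).astype(int)

# Toy usage: 20 particles over 15 candidate features.
prob_velocity = np.full((20, 15), 0.5)       # initial selection probabilities
positions = sample_positions(prob_velocity)  # binary feature masks
labels = cluster_two_groups(positions)       # two sub-populations
print(labels)
```

In this sketch the two sub-populations could then be evolved with different update rules (e.g. one exploiting elite particles, one exploring), which is the general motivation for such clustering strategies; the paper's specific per-group updates and elitism mechanism are described in the full text, not here.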
Pages: 3599-3617
Number of pages: 19