bSSA: Binary Salp Swarm Algorithm With Hybrid Data Transformation for Feature Selection

Cited by: 38
Authors
Shekhawat, Sayar Singh [1]
Sharma, Harish [1]
Kumar, Sandeep [2]
Nayyar, Anand [3, 4]
Qureshi, Basit [5]
Affiliations
[1] Rajasthan Tech Univ, Dept Comp Sci & Engn, Kota 324010, India
[2] CHRIST Deemed Univ, Dept Comp Sci & Engn, Bengaluru 560074, India
[3] Duy Tan Univ, Grad Sch, Da Nang 550000, Vietnam
[4] Duy Tan Univ, Fac Informat Technol, Da Nang 550000, Vietnam
[5] Prince Sultan Univ, Dept Comp Sci, Riyadh 11586, Saudi Arabia
Keywords
Feature extraction; Principal component analysis; Optimization; Transforms; Genetic algorithms; Computer science; Support vector machines; Data transformation; fast independent component analysis; feature selection; principal component analysis; salp swarm optimizer;
DOI
10.1109/ACCESS.2021.3049547
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Feature selection is a technique commonly used in data mining and machine learning. Traditional feature selection methods, when applied to large datasets, generate a large number of feature subsets. Selecting optimal features within this high-dimensional data space is time-consuming and negatively affects the system's performance. This paper proposes a new binary Salp Swarm Algorithm (bSSA) for selecting the best feature set from transformed datasets. The proposed feature selection method first transforms the original dataset using hybrid data transformation methods based on Principal Component Analysis (PCA) and fast Independent Component Analysis (fastICA); next, a binary Salp Swarm optimizer is used to find the best features. The proposed feature selection approach improves accuracy and eliminates the selection of irrelevant features. We validate our technique on fifteen benchmark datasets and conduct an extensive study to measure its performance and feature selection accuracy. The proposed bSSA is compared with the Binary Genetic Algorithm (bGA), Binary Binomial Cuckoo Search (bBCS), Binary Grey Wolf Optimizer (bGWO), Binary Competitive Swarm Optimizer (bCSO), and Binary Crow Search Algorithm (bCSA). The proposed method attains a mean accuracy of 95.26% while selecting 7.78% of the features on PCA-fastICA transformed datasets. The results show that bSSA outperforms the existing methods on the majority of the performance measures.
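The search step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the binarisation rule or the fitness function, so the sigmoid transfer function, the search bounds, the single-leader chain, and the `toy_fitness` stand-in below are all assumptions made for illustration. The leader and follower updates follow the generic continuous Salp Swarm Algorithm, and the PCA-fastICA transformation is not shown.

```python
import math
import random


def binary_salp_swarm(fitness, n_features, n_salps=20, n_iter=50, seed=0):
    """Minimise fitness(mask) over 0/1 feature masks (illustrative sketch)."""
    rng = random.Random(seed)
    lb, ub = -6.0, 6.0  # assumed bounds of the continuous search space

    def to_mask(position):
        # Assumed sigmoid transfer function: coordinate -> selection probability.
        return [1 if rng.random() < 1.0 / (1.0 + math.exp(-x)) else 0
                for x in position]

    salps = [[rng.uniform(lb, ub) for _ in range(n_features)]
             for _ in range(n_salps)]
    food = salps[0][:]  # continuous position of the best salp found so far
    best_mask, best_score = None, float("inf")

    for it in range(1, n_iter + 1):
        # Evaluate the binarised chain and remember the best mask seen.
        for pos in salps:
            mask = to_mask(pos)
            score = fitness(mask)
            if score < best_score:
                best_mask, best_score, food = mask, score, pos[:]

        c1 = 2.0 * math.exp(-((4.0 * it / n_iter) ** 2))  # exploration decays
        moved = []
        for i, pos in enumerate(salps):
            if i == 0:
                # Leader: move around the food source.
                new_pos = []
                for j in range(n_features):
                    step = c1 * ((ub - lb) * rng.random() + lb)
                    sign = 1.0 if rng.random() >= 0.5 else -1.0
                    new_pos.append(food[j] + sign * step)
            else:
                # Follower: move halfway towards the (already moved) salp ahead.
                new_pos = [(pos[j] + moved[i - 1][j]) / 2.0
                           for j in range(n_features)]
            moved.append([min(ub, max(lb, x)) for x in new_pos])
        salps = moved

    return best_mask, best_score


INFORMATIVE = {0, 3}  # hypothetical indices of the useful features


def toy_fitness(mask):
    # Stand-in for a classifier error: penalise each missing informative
    # feature, plus a small penalty per selected feature.
    missed = sum(1 for j in INFORMATIVE if mask[j] == 0)
    return missed + 0.01 * sum(mask)


mask, score = binary_salp_swarm(toy_fitness, n_features=8)
print(mask, score)
```

In a wrapper setting, `toy_fitness` would be replaced by a function that trains a classifier on the selected columns of the transformed dataset and returns an error-based score; the weighting between error and subset size used here is a placeholder, not a value from the paper.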
Pages: 14867-14882
Number of pages: 16
Related papers
88 in total
[1]   Quantum based Whale Optimization Algorithm for wrapper feature selection [J].
Agrawal, R. K. ;
Kaur, Baljeet ;
Sharma, Surbhi .
APPLIED SOFT COMPUTING, 2020, 89
[2]   The monarch butterfly optimization algorithm for solving feature selection problems [J].
Alweshah, Mohammed ;
Al Khalaileh, Saleh ;
Gupta, Brij B. ;
Almomani, Ammar ;
Hammouri, Abdelaziz, I ;
Al-Betar, Mohammed Azmi .
NEURAL COMPUTING & APPLICATIONS, 2022, 34 (14) :11267-11281
[3]   A hybrid mine blast algorithm for feature selection problems [J].
Alweshah, Mohammed ;
Alkhalaileh, Saleh ;
Albashish, Dheeb ;
Mafarja, Majdi ;
Bsoul, Qusay ;
Dorgham, Osama .
SOFT COMPUTING, 2021, 25 (01) :517-534
[4]  
[Anonymous], 2013, IJCAI '13
[5]   Non-Gaussianity from inflation: theory and observations [J].
Bartolo, N ;
Komatsu, E ;
Matarrese, S ;
Riotto, A .
PHYSICS REPORTS-REVIEW SECTION OF PHYSICS LETTERS, 2004, 402 (3-4) :103-266
[6]   Genetic programming for feature construction and selection in classification on high-dimensional data [J].
Binh Tran ;
Xue, Bing ;
Zhang, Mengjie .
MEMETIC COMPUTING, 2016, 8 (01) :3-15
[7]   Comparison between Principal Component Analysis and independent component analysis in electroencephalograms modelling [J].
Bugli, C. ;
Lambert, P. .
BIOMETRICAL JOURNAL, 2007, 49 (02) :312-327
[8]   A survey on feature selection methods [J].
Chandrashekar, Girish ;
Sahin, Ferat .
COMPUTERS & ELECTRICAL ENGINEERING, 2014, 40 (01) :16-28
[9]  
Chang XJ, 2014, AAAI CONF ARTIF INTE, P1171
[10]   Independent component analysis and clustering for pollution data [J].
Chattopadhyay, Asis Kumar ;
Mondal, Saptarshi ;
Biswas, Atanu .
ENVIRONMENTAL AND ECOLOGICAL STATISTICS, 2015, 22 (01) :33-43