An Intelligent Metaheuristic Binary Pigeon Optimization-Based Feature Selection and Big Data Classification in a MapReduce Environment

Cited by: 23
Authors
Abukhodair, Felwa [1]
Alsaggaf, Wafaa [1]
Jamal, Amani Tariq [2]
Abdel-Khalek, Sayed [3, 4]
Mansour, Romany F. [5]
Affiliations
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Dept Informat Technol, Jeddah 21589, Saudi Arabia
[2] King Abdulaziz Univ, Fac Comp & Informat Technol, Dept Comp Sci, Jeddah 21589, Saudi Arabia
[3] Taif Univ, Coll Sci, Dept Math & Stat, POB 11099, At Taif 21944, Saudi Arabia
[4] Sohag Univ, Fac Sci, Dept Math, Sohag 82524, Egypt
[5] New Valley Univ, Fac Sci, Dept Math, El Kharga 72511, Egypt
Keywords
big data; metaheuristics; feature selection; Hadoop; MapReduce; data classification
DOI
10.3390/math9202627
Chinese Library Classification number
O1 [Mathematics]
Subject classification code
0701; 070101
Abstract
Big Data techniques are effective for systematically extracting and analyzing massive data, and they can manage data more proficiently than conventional data handling approaches. Recently, several schemes have been developed for handling big datasets with many features. At the same time, feature selection (FS) methodologies aim to eliminate repetitive, noisy, and unwanted features that degrade classifier results. Since conventional methods fail to scale to massive data, the design of new Big Data classification models is essential. To this end, this study focuses on the design of a metaheuristic optimization-based big data classification technique in a MapReduce environment (MOBDC-MR). The MOBDC-MR technique aims to choose optimal features and classify big data effectively. It involves the design of a binary pigeon optimization algorithm (BPOA)-based FS technique to reduce complexity and increase accuracy. A long short-term memory (LSTM) model combined with beetle antenna search (BAS) is employed for big data classification. The presented MOBDC-MR technique was realized on Hadoop with the MapReduce programming model. Its performance was validated on a benchmark dataset, and the results were examined under several measures. The MOBDC-MR technique demonstrated promising performance compared with other existing techniques under different dimensions.
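The abstract does not give implementation details, so the following is only a minimal sketch of how a wrapper-style fitness for a binary feature mask could be evaluated in a map/reduce fashion. Everything in it is an assumption made for illustration: the function names, the deterministic sigmoid threshold used to binarize a position vector, the nearest-centroid classifier standing in for the paper's BAS-tuned LSTM, and the weight `alpha`. It runs in plain Python rather than on Hadoop.

```python
from functools import reduce
from math import exp


def binarize(position):
    """Turn a continuous position vector into a 0/1 feature mask.

    Assumption: a deterministic sigmoid > 0.5 rule; binary metaheuristics
    often compare the sigmoid against a random number instead.
    """
    return [1 if 1.0 / (1.0 + exp(-v)) > 0.5 else 0 for v in position]


def map_class_sums(partition, mask):
    """Mapper: per-class sums and counts of the selected features."""
    out = {}
    for features, label in partition:
        selected = [x for x, m in zip(features, mask) if m]
        sums, count = out.get(label, ([0.0] * len(selected), 0))
        out[label] = ([s + x for s, x in zip(sums, selected)], count + 1)
    return out


def reduce_class_sums(a, b):
    """Reducer: merge two partial {label: (sums, count)} dictionaries."""
    merged = dict(a)
    for label, (sums, count) in b.items():
        if label in merged:
            old_sums, old_count = merged[label]
            sums = [s + t for s, t in zip(old_sums, sums)]
            count += old_count
        merged[label] = (sums, count)
    return merged


def map_errors(partition, mask, centroids):
    """Mapper: count nearest-centroid misclassifications in one partition."""
    errors = 0
    for features, label in partition:
        selected = [x for x, m in zip(features, mask) if m]
        predicted = min(
            centroids,
            key=lambda c: sum((x - y) ** 2 for x, y in zip(selected, centroids[c])),
        )
        errors += predicted != label
    return errors


def fitness(position, partitions, alpha=0.9):
    """Lower is better: weighted error rate plus fraction of features kept."""
    mask = binarize(position)
    if not any(mask):
        return 1.0  # worst possible score when no feature is selected
    totals = reduce(reduce_class_sums, (map_class_sums(p, mask) for p in partitions))
    centroids = {c: [s / n for s in sums] for c, (sums, n) in totals.items()}
    errors = sum(map_errors(p, mask, centroids) for p in partitions)
    n_rows = sum(len(p) for p in partitions)
    return alpha * errors / n_rows + (1 - alpha) * sum(mask) / len(mask)


# Toy data: two partitions; feature 0 separates the classes, feature 1 is noise.
partitions = [
    [([0.0, 5.0], 0), ([1.0, -5.0], 0)],
    [([10.0, 5.0], 1), ([11.0, -5.0], 1)],
]
good = fitness([2.0, -2.0], partitions)  # keeps only feature 0
bad = fitness([-2.0, 2.0], partitions)   # keeps only feature 1
```

In this toy setup the mask that keeps the informative feature scores lower (better) than the mask that keeps the noisy one, which is the signal a search procedure such as BPOA would exploit when updating candidate positions.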
Pages: 14
Related papers
25 in total
  • [1] A hybrid Harris Hawks optimization algorithm with simulated annealing for feature selection
    Abdel-Basset, Mohamed
    Ding, Weiping
    El-Shahat, Doaa
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2021, 54 (01) : 593 - 637
  • [2] Al-Sarem Mohammed, 2021, Advances on Smart and Soft Computing. Proceedings of ICACIn 2020. Advances in Intelligent Systems and Computing (AISC 1188), P189, DOI 10.1007/978-981-15-6048-4_17
  • [3] A big data approach to sentiment analysis using greedy feature selection with cat swarm optimization-based long short-term memory neural networks
    Alarifi, Abdulaziz
    Tolba, Amr
    Al-Makhadmeh, Zafer
    Said, Wael
    [J]. JOURNAL OF SUPERCOMPUTING, 2020, 76 (06) : 4414 - 4429
  • [4] High-dimensional QSAR/QSPR classification modeling based on improving pigeon optimization algorithm
    Algamal, Zakariya Yahya
    Qasim, Maimoonah Khalid
    Lee, Muhammad Hisyam
    Ali, Haithem Taha Mohammad
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2020, 206 (206)
  • [5] The monarch butterfly optimization algorithm for solving feature selection problems
    Alweshah, Mohammed
    Al Khalaileh, Saleh
    Gupta, Brij B.
    Almomani, Ammar
    Hammouri, Abdelaziz I.
    Al-Betar, Mohammed Azmi
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (14) : 11267 - 11281
  • [6] Online feature selection system for big data classification based on multi-objective automated negotiation
    BenSaid, Fatma
    Alimi, Adel M.
    [J]. PATTERN RECOGNITION, 2021, 110
  • [7] Evolutionary computation for feature selection in classification problems
    de la Iglesia, Beatriz
    [J]. WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2013, 3 (06) : 381 - 407
  • [8] MapReduce: A Flexible Data Processing Tool
    Dean, Jeffrey
    Ghemawat, Sanjay
    [J]. COMMUNICATIONS OF THE ACM, 2010, 53 (01) : 72 - 77
  • [9] An efficient ACO-PSO-based framework for data classification and preprocessing in big data
    Dubey, Ashutosh Kumar
    Kumar, Abhishek
    Agrawal, Rashmi
    [J]. EVOLUTIONARY INTELLIGENCE, 2021, 14 (02) : 909 - 922
  • [10] Improved Feature Selection Model for Big Data Analytics
    El-Hasnony, Ibrahim M.
    Barakat, Sherif I.
    Elhoseny, Mohamed
    Mostafa, Reham R.
    [J]. IEEE ACCESS, 2020, 8 : 66989 - 67004