Improved intelligent water drop-based hybrid feature selection method for data

Cited by: 12
Authors
Alhenawi, Esra'a [1 ,2 ]
Al-Sayyed, Rizik [2 ]
Hudaib, Amjad [2 ]
Mirjalili, Seyedali [3 ,4 ]
Affiliations
[1] Al Ahliyya Amman Univ, Software Engn Dept, Amman, Jordan
[2] Univ Jordan, King Abdullah II Sch Informat Technol, Amman, Jordan
[3] Torrens Univ Australia, Ctr Artificial Intelligence Res & Optimizat, Fortitude Valley, Brisbane, Qld 4006, Australia
[4] Obuda Univ, Univ Res & Innovat Ctr, Budapest, Hungary
Keywords
Machine learning; Intelligent water drop algorithm; Hybrid feature selection; High dimensional datasets; Medical applications; OPTIMIZATION; ALGORITHM; ENSEMBLE; SEARCH
DOI
10.1016/j.compbiolchem.2022.107809
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Classifying microarray datasets, which usually contain many noisy genes that degrade classifier performance and lower classification accuracy, is a competitive research topic. Feature selection (FS) is one of the most practical ways of finding an optimal subset of genes that increases classification accuracy for diagnostic and prognostic prediction of cancer from microarray datasets. There is therefore a continuing need for more efficient FS methods that select only an optimal or close-to-optimal subset of features to improve classification performance. In this paper, we propose a hybrid FS method for microarray data processing that combines an ensemble filter with an Improved Intelligent Water Drop (IIWD) algorithm as a wrapper. The IIWD algorithm extends the original IWD algorithm in two ways: it adds one of three local search (LS) algorithms, Tabu Search (TS), Novel LS Algorithm (NLSA), or Hill Climbing (HC), to each IWD iteration, and it uses a correlation coefficient filter as the heuristic undesirability (HUD) for next-node selection. The effect of the three LS algorithms is evaluated by comparing the proposed ensemble filter and IIWD-based wrapper without any LS algorithm (the PHFS-IWD method) against the same wrapper with TS, NLSA, or HC added (the PHFS-IWDTS, PHFS-IWDNLSA, and PHFS-IWDHC methods, respectively). A Naive Bayes (NB) classifier and five microarray datasets were used to evaluate and compare the proposed hybrid FS methods. The results show that using an LS algorithm in each IWD iteration improves the F-score by 5% on average compared with PHFS-IWD. In addition, PHFS-IWDNLSA improves the F-score by an average of 4.15% over PHFS-IWDTS and 5.67% over PHFS-IWDHC, while PHFS-IWDTS outperforms PHFS-IWDHC by an average of 1.6%.
Moreover, the proposed hybrid FS methods improve accuracy by an average of 8.92% on three of the five datasets and reduce the number of genes by 58.5% on all five datasets compared with six recent state-of-the-art FS methods.
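As a rough illustration of the local-search idea mentioned in the abstract, the sketch below runs a simple hill climbing pass over a binary feature mask. It is not the authors' implementation: the names `hill_climb`, `toy_fitness`, `FEATURES`, and `LABELS` are hypothetical, the data is a made-up toy example, and the fitness function (summed absolute Pearson correlation with the label minus a size penalty) is only a stand-in for the Naive Bayes F-score that the paper uses to evaluate subsets.

```python
from typing import Callable, List, Sequence, Tuple

# Toy data (assumption, not from the paper): three features, six samples.
FEATURES: List[List[float]] = [
    [0.1, 0.2, 0.1, 0.9, 0.8, 1.0],  # tracks the label closely
    [0.5, 0.4, 0.6, 0.5, 0.4, 0.6],  # unrelated to the label
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],  # constant, carries no information
]
LABELS: List[float] = [0, 0, 0, 1, 1, 1]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def toy_fitness(mask: Sequence[int]) -> float:
    """Stand-in fitness: reward label correlation, penalize subset size."""
    relevance = sum(
        abs(pearson(FEATURES[i], LABELS)) for i, bit in enumerate(mask) if bit
    )
    return relevance - 0.1 * sum(mask)


def hill_climb(
    mask: Sequence[int],
    fitness: Callable[[Sequence[int]], float],
    max_passes: int = 100,
) -> Tuple[List[int], float]:
    """Flip one bit at a time and keep any flip that raises the fitness.

    Stops when a full pass over all bits yields no improvement.
    """
    best = list(mask)
    best_score = fitness(best)
    for _ in range(max_passes):
        improved = False
        for i in range(len(best)):
            neighbor = best.copy()
            neighbor[i] = 1 - neighbor[i]  # add or drop feature i
            score = fitness(neighbor)
            if score > best_score:
                best, best_score = neighbor, score
                improved = True
        if not improved:
            break
    return best, best_score


if __name__ == "__main__":
    refined_mask, refined_score = hill_climb([1, 1, 1], toy_fitness)
    print(refined_mask, round(refined_score, 3))
```

In a hybrid setup like the one the abstract describes, a refinement step of this kind would presumably be applied to the subset produced in each IWD iteration, with the fitness swapped for a classifier-based score.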
Pages: 14
Related papers
50 in total
  • [21] Improved Feature Selection Model for Big Data Analytics
    El-Hasnony, Ibrahim M.
    Barakat, Sherif I.
    Elhoseny, Mohamed
    Mostafa, Reham R.
    IEEE ACCESS, 2020, 8 : 66989 - 67004
  • [22] Path Planning of Inspection Robot Based on Improved Intelligent Water Drop Algorithm
    Zhang, Xuhui
    Ji, Ying
    Wang, Chunyang
    Lin, Haijun
    Wang, Yueqiang
    IEEE ACCESS, 2023, 11 : 119993 - 120000
  • [23] An efficient hybrid filter-wrapper method based on improved Harris Hawks optimization for feature selection
    Pirgazi, Jamshid
    Kallehbasti, Mohammad Mehdi Pourhashem
    Sorkhi, Ali Ghanbari
    Kermani, Ali
    BIOIMPACTS, 2024
  • [24] Mean based relief: An improved feature selection method based on ReliefF
    Nitisha Aggarwal
    Unmesh Shukla
    Geetika Jain Saxena
    Mukesh Rawat
    Anil Singh Bafila
    Sanjeev Singh
    Amit Pundir
    Applied Intelligence, 2023, 53 : 23004 - 23028
  • [25] Intelligent Hybrid Feature Selection for Textual Sentiment Classification
    Khan, Jawad
    Alam, Aftab
    Lee, Youngmoon
    IEEE ACCESS, 2021, 9 : 140590 - 140608
  • [26] A Classification Method Based on Feature Selection for Imbalanced Data
    Liu, Yi
    Wang, Yanzhen
    Ren, Xiaoguang
    Zhou, Hao
    Diao, Xingchun
    IEEE ACCESS, 2019, 7 : 81794 - 81807
  • [27] Hybrid Feature Generation and Selection with a Focus on Novel Genetic-Based Generated Feature Method for Modeling Products in the Sulfur Recovery Unit
    Moayedi, Farshad
    Abolghasemi, Hossein
    Shokri, Saeid
    Ganji, Hamid
    Hamedi, Amir Hossein
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2023, 48 (07) : 9023 - 9034
  • [28] Clustering-based hybrid feature selection approach for high dimensional microarray data
    Babu, Samson Anosh P.
    Annavarapu, Chandra Sekhara Rao
    Dara, Suresh
    CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2021, 213
  • [29] Information granularity-based incremental feature selection for partially labeled hybrid data
    Shu, Wenhao
    Yan, Zhenchao
    Chen, Ting
    Yu, Jianhui
    Qian, Wenbin
    INTELLIGENT DATA ANALYSIS, 2022, 26 (01) : 33 - 56
  • [30] A Fast Hybrid Feature Selection Method
    Ganjei, Mohammad Ahmadi
    Boostani, Reza
    2019 9TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE 2019), 2019, : 6 - 11