Serial filter-wrapper feature selection method with elite guided mutation strategy on cancer gene expression data

Cited by: 1
Authors
Song, Yu-Wei [1]
Wang, Jie-Sheng [1]
Qi, Yu-Liang [1]
Wang, Yu-Cai [1]
Song, Hao-Ming [1]
Shang-Guan, Yi-Peng [1]
Affiliations
[1] Univ Sci & Technol Liaoning, Sch Elect & Informat Engn, Anshan, Liaoning, Peoples R China
Keywords
Feature selection; Cancer gene expression; Equilibrium optimizer; Parallel filter methods; Elite guided mutation strategies; Serial hybrid frameworks; CLASSIFICATION; ALGORITHM;
DOI
10.1007/s10462-024-11029-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Many researchers use cancer gene expression data to address cancer subtype diagnosis, but such data are often high-dimensional, multi-sample, and multi-class. A hybrid serial filter-wrapper feature selection (FS) method based on an elite guided mutation strategy is therefore proposed for cancer gene expression data. The method is divided into a preliminary screening stage and a combined modeling stage. In the preliminary screening stage, the thresholds of seven filter methods are determined by leave-one-out cross-validation, and the features selected by these seven filter methods are combined into two subsets using the logical operations "And" and "Or". The union of these two subsets is retained as the reserved subset of the preliminary screening stage and is passed to the equilibrium optimizer (EO) in the subsequent combined modeling stage. The resulting hybrid framework, which connects the parallel filter method designed in the first stage with an improved EO in the second stage, is named SFEMEO. To demonstrate the effectiveness and generalization of the proposed SFEMEO, it is compared with nine other basic algorithms on 10 UCI data sets. The classification accuracy of SFEMEO is improved by 0.56% to 20.19%, and the optimal fitness is also greatly improved. When SFEMEO is compared with nine other intelligent optimization algorithms on ten cancer gene expression data sets, its accuracy is improved by 3.73% to 18.13% relative to most algorithms, and its optimal fitness is relatively superior. In addition, the Wilcoxon rank sum test was used to evaluate the results of SFEMEO and the other intelligent optimization algorithms, which supports the effectiveness of the proposed hybrid framework and its superiority in solving the FS problem for high-dimensional cancer gene expression data.
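The "And"/"Or" combination in the preliminary screening stage can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `combine_filter_masks`, the three toy filters (the abstract uses seven), the score values, and the fixed thresholds (the abstract selects thresholds by leave-one-out cross-validation) are all assumptions made for illustration.

```python
import numpy as np

def combine_filter_masks(masks):
    """Combine per-filter boolean selection masks (hypothetical helper).

    masks: array-like of shape (n_filters, n_features); True means the
    filter kept that feature.
    Returns the "And" subset, the "Or" subset, and their union, each as
    an array of feature indices.
    """
    masks = np.asarray(masks, dtype=bool)
    and_mask = masks.all(axis=0)        # kept by every filter
    or_mask = masks.any(axis=0)         # kept by at least one filter
    reserved_mask = and_mask | or_mask  # union of the two subsets
    return (np.flatnonzero(and_mask),
            np.flatnonzero(or_mask),
            np.flatnonzero(reserved_mask))

# Toy scores for 3 hypothetical filters over 4 features (assumed values).
scores = np.array([[0.9, 0.2, 0.7, 0.1],
                   [0.8, 0.6, 0.3, 0.2],
                   [0.7, 0.1, 0.9, 0.4]])
# Fixed thresholds stand in for the cross-validated ones (assumption).
thresholds = np.array([0.5, 0.5, 0.5])
masks = scores >= thresholds[:, None]

and_subset, or_subset, reserved = combine_filter_masks(masks)
```

Under this literal reading, the "And" subset is always contained in the "Or" subset, so the reserved union coincides with the "Or" subset. The paper may construct the two subsets differently, and the abstract alone does not settle this.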
Pages: 49