Serial filter-wrapper feature selection method with elite guided mutation strategy on cancer gene expression data

Cited by: 1
Authors
Song, Yu-Wei [1 ]
Wang, Jie-Sheng [1 ]
Qi, Yu-Liang [1 ]
Wang, Yu-Cai [1 ]
Song, Hao-Ming [1 ]
Shang-Guan, Yi-Peng [1 ]
Affiliations
[1] Univ Sci & Technol Liaoning, Sch Elect & Informat Engn, Anshan, Liaoning, Peoples R China
Keywords
Feature selection; Cancer gene expression; Equilibrium optimizer; Parallel filter methods; Elite guided mutation strategies; Serial hybrid frameworks; CLASSIFICATION; ALGORITHM
DOI
10.1007/s10462-024-11029-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many researchers use cancer gene expression data to diagnose cancer subtypes, but such data are often high-dimensional, multi-sample, and multi-class. To address this, a hybrid serial filter-wrapper feature selection (FS) method based on an elite guided mutation strategy is proposed for cancer gene expression data. The method is divided into a preliminary screening stage and a combined modeling stage. In the preliminary screening stage, the thresholds of seven filter methods are determined by leave-one-out cross-validation, and the features selected by these seven filters are combined into two subsets using the logical "And" and "Or" operations. The union of these two subsets is retained from the preliminary screening stage and passed to the equilibrium optimizer (EO) in the subsequent combined modeling stage. The resulting hybrid framework, which connects the parallel filter method designed in the first stage with an improved EO in the second stage, is named SFEMEO. To demonstrate the effectiveness and generalization of SFEMEO, it is compared with nine other basic algorithms on 10 UCI data sets; its classification accuracy improves by 0.56% to 20.19%, and the optimal fitness also improves substantially. SFEMEO is then compared with nine other intelligent optimization algorithms on ten cancer gene expression data sets; relative to most of these algorithms, accuracy improves by 3.73% to 18.13%, and the optimal fitness is relatively superior. In addition, the Wilcoxon rank sum test was used to evaluate the results of SFEMEO and the other intelligent optimization algorithms, which supported the effectiveness of the proposed hybrid framework and its superiority in solving the FS problem for high-dimensional cancer gene expression data.
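The "And"/"Or" combination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the seven filters or say how their thresholds are applied, so the function `combine_filter_masks`, the three hypothetical filters, and the score and threshold values below are all assumptions chosen only to show the logical operations.

```python
import numpy as np

def combine_filter_masks(scores, thresholds):
    """Combine per-filter selections with logical AND and OR.

    scores:     array of shape (n_filters, n_features), one relevance
                score per feature from each (hypothetical) filter.
    thresholds: array of shape (n_filters,), one cutoff per filter.
    A filter is assumed to select a feature when score >= threshold.
    """
    scores = np.asarray(scores, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    # Boolean matrix: selected[i, j] is True if filter i keeps feature j.
    selected = scores >= thresholds[:, None]
    and_mask = selected.all(axis=0)  # kept by every filter ("And")
    or_mask = selected.any(axis=0)   # kept by at least one filter ("Or")
    return and_mask, or_mask

# Made-up scores for 3 filters over 4 features (illustration only).
scores = np.array([
    [0.9, 0.2, 0.6, 0.1],
    [0.8, 0.7, 0.3, 0.2],
    [0.7, 0.1, 0.9, 0.0],
])
thresholds = np.array([0.5, 0.5, 0.5])

and_mask, or_mask = combine_filter_masks(scores, thresholds)
and_idx = np.flatnonzero(and_mask)  # indices of features in the "And" subset
or_idx = np.flatnonzero(or_mask)    # indices of features in the "Or" subset
print("And subset:", and_idx.tolist())
print("Or subset:", or_idx.tolist())
```

In this toy setting the "And" subset is the stricter of the two, since a feature must pass every filter to enter it. How the paper actually builds and merges the two subsets before the EO stage may differ from this simplification.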
Pages: 49
Related papers
50 in total
  • [21] Distributed feature selection (DFS) strategy for microarray gene expression data to improve the classification performance
    Potharaju, Sai Prasad
    Sreedevi, M.
    CLINICAL EPIDEMIOLOGY AND GLOBAL HEALTH, 2019, 7 (02): : 171 - 176
  • [22] RETRACTED: A wrapper based feature selection in bone marrow plasma cell gene expression data (Retracted Article)
    Ragunthar, T.
    Selvakumar, S.
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2019, 22 (Suppl 6): : 13785 - 13796
  • [23] An improved conditional relevance and weighted redundancy feature selection method for gene expression data
    Qin, Xiwen
    Zhang, Siqi
    Dong, Xiaogang
    Luo, Tingru
    Shi, Hongyu
    Yuan, Liping
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (01)
  • [24] Feature selection methods on gene expression microarray data for cancer classification: A systematic review
    Alhenawi, Esra'a
    Al-Sayyed, Rizik
    Hudaib, Amjad
    Mirjalili, Seyedali
    COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 140
  • [25] Null space based feature selection method for gene expression data
    Alok Sharma
    Seiya Imoto
    Satoru Miyano
    Vandana Sharma
    International Journal of Machine Learning and Cybernetics, 2012, 3 : 269 - 276
  • [26] Mixture feature selection strategy applied in cancer classification from gene expression
    Jin, Xing
    Deng, Yufeng
    Zhong, Yixin
    2005 27TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7, 2005, : 4807 - 4809
  • [27] PF-PSS: a double-layer parallel embedded feature selection method for cancer gene expression data
    Yu-Wei Song
    Jie-Sheng Wang
    Yu-Liang Qi
    Yu-Cai Wang
    Shi Li
    Hao-Ming Song
    Yi-Peng Shang-Guan
    Journal of Big Data, 12 (1)
  • [28] A Comprehensive Survey of Recent Hybrid Feature Selection Methods in Cancer Microarray Gene Expression Data
    Almazrua, Halah
    Alshamlan, Hala
    IEEE ACCESS, 2022, 10 : 71427 - 71449
  • [29] GOG-MBSHO: multi-strategy fusion binary sea-horse optimizer with Gaussian transfer function for feature selection of cancer gene expression data
    Wang, Yu-Cai
    Song, Hao-Ming
    Wang, Jie-Sheng
    Song, Yu-Wei
    Qi, Yu-Liang
    Ma, Xin-Ru
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (12)
  • [30] FG-HFS: A feature filter and group evolution hybrid feature selection algorithm for high-dimensional gene expression data
    Xu, Zhaozhao
    Yang, Fangyuan
    Tang, Chaosheng
    Wang, Hong
    Wang, Shuihua
    Sun, Junding
    Zhang, Yudong
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 245