Serial filter-wrapper feature selection method with elite guided mutation strategy on cancer gene expression data

Cited by: 1
Authors
Song, Yu-Wei [1]
Wang, Jie-Sheng [1]
Qi, Yu-Liang [1]
Wang, Yu-Cai [1]
Song, Hao-Ming [1]
Shang-Guan, Yi-Peng [1]
Affiliations
[1] Univ Sci & Technol Liaoning, Sch Elect & Informat Engn, Anshan, Liaoning, Peoples R China
Keywords
Feature selection; Cancer gene expression; Equilibrium optimizer; Parallel filter methods; Elite guided mutation strategies; Serial hybrid frameworks; CLASSIFICATION; ALGORITHM;
DOI
10.1007/s10462-024-11029-1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many researchers now use cancer gene expression data to diagnose cancer subtypes, but such data are often high-dimensional, multi-sample, and multi-class. To address this, a hybrid serial filter-wrapper feature selection (FS) method based on an elite guided mutation strategy is proposed for cancer gene expression data. The method is divided into a preliminary screening stage and a combined modeling stage. In the preliminary screening stage, the thresholds of seven filter methods are determined by leave-one-out cross-validation, and the features selected by these seven filter methods are combined into two subsets using the logical operations "AND" and "OR". The union of the two subsets is retained as the reserved subset of the preliminary screening stage and passed to the equilibrium optimizer (EO) in the subsequent combined modeling stage. The resulting hybrid framework, which connects the parallel filter method designed in the first stage with an improved EO in the second stage, is named SFEMEO. To verify the effectiveness and generalization of SFEMEO, it is compared with nine other basic algorithms on ten UCI data sets. The classification accuracy of SFEMEO is improved by 0.56% to 20.19%, and the optimal fitness is also greatly improved. When SFEMEO is compared with nine other intelligent optimization algorithms on ten cancer gene expression data sets, its accuracy is 3.73% to 18.13% higher than that of most of the algorithms, and its optimal fitness is relatively superior. In addition, the Wilcoxon rank-sum test is used to evaluate the results of SFEMEO and the other intelligent optimization algorithms, which supports the effectiveness of the proposed hybrid framework and its superiority in solving the FS problem for high-dimensional cancer gene expression data.
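The AND/OR combination step of the preliminary screening stage can be illustrated with a small sketch. It is not the authors' implementation. The abstract does not say which seven filter methods are used or how they are grouped before the logical operations, so the sketch assumes two stand-in filters (feature variance and absolute correlation with the label), a fixed top-k threshold in place of the cross-validated thresholds, and AND/OR taken across all filters. The names `top_k_mask` and `combine_filter_masks` are hypothetical.

```python
import numpy as np


def top_k_mask(scores, k):
    """Boolean mask marking the k highest-scoring features (assumes k >= 1)."""
    mask = np.zeros(scores.shape[0], dtype=bool)
    mask[np.argsort(scores)[-k:]] = True
    return mask


def combine_filter_masks(masks):
    """Combine per-filter selections with logical AND and OR.

    Assumption: AND/OR are applied across all filters at once. Under this
    simple reading the reserved subset equals the OR subset; the paper may
    group the filters differently.
    """
    stacked = np.vstack(masks)
    and_subset = stacked.all(axis=0)   # features every filter selected
    or_subset = stacked.any(axis=0)    # features at least one filter selected
    reserved = and_subset | or_subset  # union of the two subsets
    return and_subset, or_subset, reserved


# Toy data standing in for a gene expression matrix (samples x features).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Two stand-in filter scores (hypothetical, not the paper's seven filters).
variance_scores = X.var(axis=0)
corr_scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])

masks = [top_k_mask(variance_scores, 4), top_k_mask(corr_scores, 4)]
and_subset, or_subset, reserved = combine_filter_masks(masks)
print("reserved feature indices:", np.flatnonzero(reserved))
```

In a serial filter-wrapper design, only the columns of `X` flagged by `reserved` would be passed to the wrapper stage (the improved EO in the paper), which shrinks the search space the optimizer has to explore.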
Pages: 49
Related papers
  • [41] Feature selection using non-dominant features-guided search for gene expression profile data
    Pan, Xiaoying
    Sun, Jun
    Yu, Huimin
    Xue, Yufeng
    COMPLEX & INTELLIGENT SYSTEMS, 2023, 9 (06) : 6139 - 6153
  • [42] Hybrid Feature Selection Method for Predicting Alzheimer's Disease Using Gene Expression Data
    El-Gawady, Aliaa
    Tawfik, BenBella S.
    Makhlouf, Mohamed A.
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (03): : 5559 - 5572
  • [43] Use of SVM-based ensemble feature selection method for gene expression data analysis
    Zhang, Shizhi
    Zhang, Mingjin
    STATISTICAL APPLICATIONS IN GENETICS AND MOLECULAR BIOLOGY, 2022, 21 (01)
  • [45] Feature selection of gene expression data for Cancer classification using double RBF-kernels
    Liu, Shenghui
    Xu, Chunrui
    Zhang, Yusen
    Liu, Jiaguo
    Yu, Bin
    Liu, Xiaoping
    Dehmer, Matthias
    BMC BIOINFORMATICS, 2018, 19
  • [47] GENE EXPRESSION DATA ANALYSIS USING PSEUDO STANDARD DEVIATION MINIMIZATION FEATURE FUSION METHOD FOR CANCER DIAGNOSIS
    Piao, Haiyan
    JOURNAL OF MECHANICS IN MEDICINE AND BIOLOGY, 2012, 12 (01)
  • [48] A feature selection method using fixed-point algorithm for DNA microarray gene expression data
    Sharma, Alok
    Paliwal, Kuldip K.
    Imoto, Seiya
    Miyano, Satoru
    Sharma, Vandana
    Ananthanarayanan, Rajeshkannan
    INTERNATIONAL JOURNAL OF KNOWLEDGE-BASED AND INTELLIGENT ENGINEERING SYSTEMS, 2014, 18 (01) : 55 - 59
  • [49] Unsupervised feature selection algorithm for multiclass cancer classification of gene expression RNA-Seq data
    Garcia-Diaz, Pilar
    Sanchez-Berriel, Isabel
    Martinez-Rojas, Juan A.
    Diez-Pascual, Ana M.
    GENOMICS, 2020, 112 (02) : 1916 - 1925
  • [50] Dimension Reduction and Classifier-Based Feature Selection for Oversampled Gene Expression Data and Cancer Classification
    Petinrin, Olutomilayo Olayemi
    Saeed, Faisal
    Salim, Naomie
    Toseef, Muhammad
    Liu, Zhe
    Muyide, Ibukun Omotayo
    PROCESSES, 2023, 11 (07)