Optimized feature selection method using particle swarm intelligence with ensemble learning for cancer classification based on microarray datasets

Cited by: 29
Authors
Alrefai, Nashat [1]
Ibrahim, Othman [1, 2]
Affiliations
[1] Univ Teknol Malaysia UTM, Fac Engn, Sch Comp, Johor Baharu 81310, Johor, Malaysia
[2] Univ Teknol Malaysia UTM, Azman Hashim Int Business Sch, Skudai 81310, Johor, Malaysia
Keywords
Feature selection; Ensemble learning; Cancer classification; Particle swarm optimization; Microarray; GENETIC ALGORITHM; PREDICTION; DIVERSITY; DIAGNOSIS; PATTERNS; TUMOR
DOI
10.1007/s00521-022-07147-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cancer is a leading cause of mortality in both developed and developing countries. Cancer classification based on microarray datasets has provided insight into possible treatment strategies. Microarray datasets are characterized by a very large, high-dimensional set of genes and only a small number of instances. Gene selection is therefore a challenging but necessary task in the analysis of microarray expression data, and the selected genes may reveal insight into the mechanism underlying a particular biological phenomenon. Several researchers have recently developed feature selection methods that use metaheuristic algorithms to interpret and analyze microarray data. Nevertheless, because microarray data contain few samples relative to their high dimensionality, several data mining approaches have failed to select the most relevant and informative genes. Combining several classifiers can enhance feature selection and classification performance. This study therefore proposes a method for cancer classification in which particle swarm optimization and ensemble learning work together for feature selection and classification. The analysis indicates that the proposed method is effective for cancer classification on microarray datasets: its accuracy is 100%, 92.86%, 86.36%, 100%, and 85.71% on the leukemia, colon, breast cancer, ovarian, and central nervous system datasets, respectively. These results outperform most state-of-the-art methods and improve on the baseline ensemble method by 12%.
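The abstract does not specify how the swarm and the ensemble are combined, so the following is only a minimal sketch of one common arrangement: a binary particle swarm searches over gene subsets, and each subset is scored by the leave-one-out accuracy of a small majority-vote ensemble. Everything in it is an assumption made for illustration, not the authors' method: the synthetic data from `make_toy_data`, the three base classifiers (nearest centroid, 1-NN, 3-NN), the PSO constants, and the small subset-size penalty in `fitness`.

```python
import math
import random


def nearest_centroid(train_X, train_y, x):
    """Predict the label whose class mean is closest to x."""
    best, best_d = None, math.inf
    for label in sorted(set(train_y)):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        d = math.dist(centroid, x)
        if d < best_d:
            best, best_d = label, d
    return best


def k_nearest(train_X, train_y, x, k):
    """Predict by majority vote among the k closest training samples."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = [train_y[i] for i in order[:k]]
    return max(sorted(set(votes)), key=votes.count)


def ensemble_predict(train_X, train_y, x):
    """Majority vote over three simple base classifiers (assumed, illustrative)."""
    votes = [
        nearest_centroid(train_X, train_y, x),
        k_nearest(train_X, train_y, x, 1),
        k_nearest(train_X, train_y, x, 3),
    ]
    return max(sorted(set(votes)), key=votes.count)


def fitness(mask, X, y):
    """Leave-one-out ensemble accuracy on the selected features,
    minus a small (assumed) penalty that favours smaller subsets."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0
    Xs = [[row[j] for j in cols] for row in X]
    correct = 0
    for i in range(len(Xs)):
        pred = ensemble_predict(Xs[:i] + Xs[i + 1:], y[:i] + y[i + 1:], Xs[i])
        correct += int(pred == y[i])
    return correct / len(Xs) - 0.01 * len(cols) / len(mask)


def binary_pso(X, y, n_particles=8, n_iters=15, seed=0):
    """Binary PSO: velocities are real-valued, positions are 0/1 feature masks
    resampled through a sigmoid of the velocity."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(n_particles)]
    vel = [[0.0] * n_feat for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, X, y) for p in pos]
    g = max(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_feat):
                v = (0.7 * vel[i][j]
                     + 1.5 * rng.random() * (pbest[i][j] - pos[i][j])
                     + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-6.0, min(6.0, v))  # clamp to keep exp() stable
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i], X, y)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def make_toy_data(n_samples=30, n_feat=12, seed=1):
    """Synthetic stand-in for a microarray: only features 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_feat)]
        row[0] += 3 * label
        row[1] -= 3 * label
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
mask, score = binary_pso(X, y)
print("selected features:", [j for j, bit in enumerate(mask) if bit])
print("penalised leave-one-out accuracy:", round(score, 4))
```

Real microarray data have thousands of genes and far fewer samples than this toy set, so a practical version would likely need a filter step before the swarm search and a held-out test split to avoid optimistic accuracy estimates.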
Pages: 13513-13528
Number of pages: 16