Gene selection with Game Shapley Harris hawks optimizer for cancer classification

Cited by: 18
Authors
Afreen, Sana [1]
Bhurjee, Ajay Kumar [1]
Aziz, Rabia Musheer [1]
Affiliations
[1] VIT Bhopal Univ, Sehore 466114, Madhya Pradesh, India
Keywords
Feature selection (FS); Kernel Shapley value (kSV); Harris hawks optimizer (HHO); Naive Bayes (NB); K-nearest neighbors (KNN); Support vector machines (SVM); PREDICTION; ALGORITHM
DOI
10.1016/j.chemolab.2023.104989
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cancer is a perilous disease for humans and the second leading cause of death globally. Diagnosis at an advanced stage may not be effective in preventing patient mortality. It is therefore important to establish a sustainable framework that yields reliable predictions for early cancer diagnosis. This paper presents a new two-phase feature (gene) selection approach. In the first phase, the kernel Shapley value (kSV), a feature extraction approach based on cooperative game theory, is used to extract the important features from high-dimensional gene expression data. In the second phase, the Harris hawks optimizer (HHO) algorithm is used to further optimize the most effective features extracted by kSV. To evaluate the effectiveness of the proposed algorithm, we conduct extensive experiments on eight benchmark high-dimensional gene expression datasets and compare it with other state-of-the-art techniques. We employ three classifiers, namely support vector machines (SVM), Naive Bayes (NB), and K-nearest neighbors (KNN), to assess the efficacy of the selected genes and their impact on classification accuracy. The experimental results demonstrate that the proposed method, particularly when combined with the SVM classifier, outperforms other gene selection methods. The evaluation metrics (accuracy, precision, recall, F1-score, and ROC-AUC), together with box plots and radar plots, consistently indicate the superiority of kSV-HHO across all tested datasets. Moreover, comparative and statistical analysis reveals that the proposed method excels at identifying the most relevant features for cancer diagnosis compared to other gene selection approaches. This makes the framework a valuable tool for cancer research and clinical practice, potentially enhancing the accuracy of early cancer diagnosis using high-dimensional gene expression biomedical data.
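The two-phase idea in the abstract (rank genes by a Shapley-style score, then search within the shortlist) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several stand-ins are assumed: exact Shapley values replace the kernel approximation, an exhaustive search over the shortlist replaces HHO, a nearest-centroid classifier replaces SVM/NB/KNN, and the tiny expression matrix is made up. All function names (`accuracy`, `shapley_values`, `select_genes`) are hypothetical.

```python
from itertools import combinations
from math import factorial

# Made-up toy expression matrix: rows are samples, columns are genes.
# Genes 0 and 1 separate the two classes (gene 1 is redundant with gene 0);
# genes 2 and 3 have the same distribution in both classes.
X = [
    [0, 1, 1, 3], [0, 1, 2, 4], [0, 1, 1, 4], [0, 1, 2, 3],
    [10, 9, 1, 3], [10, 9, 2, 4], [10, 9, 1, 4], [10, 9, 2, 3],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def accuracy(genes):
    """Resubstitution accuracy of a nearest-centroid classifier on `genes`."""
    if not genes:
        # With no genes, fall back to the majority-class baseline.
        return max(y.count(c) for c in set(y)) / len(y)
    centroids = {}
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        centroids[c] = [sum(r[g] for r in rows) / len(rows) for g in genes]
    hits = 0
    for x, label in zip(X, y):
        dist = {c: sum((x[g] - m) ** 2 for g, m in zip(genes, mu))
                for c, mu in centroids.items()}
        hits += min(dist, key=dist.get) == label  # ties go to the lowest label
    return hits / len(y)


def shapley_values(n_genes, value):
    """Exact Shapley value of each gene in the game `value` (tiny n only)."""
    phi = [0.0] * n_genes
    for g in range(n_genes):
        others = [j for j in range(n_genes) if j != g]
        for size in range(n_genes):
            weight = (factorial(size) * factorial(n_genes - size - 1)
                      / factorial(n_genes))
            for coalition in combinations(others, size):
                with_g = tuple(sorted(coalition + (g,)))
                phi[g] += weight * (value(with_g) - value(coalition))
    return phi


def select_genes(shortlist, value, penalty=0.01):
    """Phase 2 stand-in: best shortlist subset by accuracy minus a size penalty."""
    candidates = [s for k in range(1, len(shortlist) + 1)
                  for s in combinations(sorted(shortlist), k)]
    return max(candidates, key=lambda s: value(s) - penalty * len(s))


# Phase 1: score every gene and keep the top two.
phi = shapley_values(4, accuracy)
shortlist = sorted(range(4), key=lambda g: phi[g], reverse=True)[:2]

# Phase 2: refine the shortlist into the final gene subset.
best = select_genes(shortlist, accuracy)
print("Shapley values:", phi)
print("shortlist:", shortlist, "selected:", best, "accuracy:", accuracy(best))
```

On this toy data the two informative genes share the credit equally and the two uninformative genes score zero, so phase 1 shortlists genes 0 and 1, and the size penalty in phase 2 then drops one of the redundant pair. Exact Shapley computation and exhaustive search both scale exponentially with the number of genes, which is presumably why an approximation and a metaheuristic are needed for real high-dimensional expression data.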
Pages: 19