Genetic algorithm based cancerous gene identification from microarray data using ensemble of filter methods

Cited by: 0
Authors
Manosij Ghosh
Sukdev Adhikary
Kushal Kanti Ghosh
Aritra Sardar
Shemim Begum
Ram Sarkar
Affiliations
[1] Jadavpur University, Department of Computer Science and Engineering
[2] Government College of Engineering & Textile Technology, Department of Computer Science and Engineering
Source
Medical & Biological Engineering & Computing | 2019 / Vol. 57
Keywords
Wrapper method; Filter method; Ensemble; Microarray data; Cancer detection
DOI
Not available
Abstract
Microarray datasets play a crucial role in cancer detection, but their high dimensionality makes classification challenging because many features are irrelevant or redundant. Feature selection is therefore indispensable in this field, as it removes unneeded features from the system. Because selecting an optimal feature subset is an NP-hard problem, meta-heuristic search techniques are used to cope with it. In this paper, we propose a two-stage model for feature selection in microarray datasets. The gene rankings produced by different filter methods are quite diverse, and the effectiveness of each ranking is dataset dependent. In the first stage, we therefore develop an ensemble of filter methods by taking the union and the intersection of the top-n features of ReliefF, chi-square, and symmetrical uncertainty. This ensemble combines the information of the three rankings in a single subset. In the second stage, we apply a genetic algorithm (GA) to the union and to the intersection to fine-tune the results; the union performs better than the intersection. The model is shown to be classifier independent through the use of three classifiers: multi-layer perceptron (MLP), support vector machine (SVM), and K-nearest neighbor (K-NN). We tested the model on five cancer datasets: colon, lung, leukemia, SRBCT, and prostate. Experimental results illustrate the superiority of our model over state-of-the-art methods.
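The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the filter scores are made-up numbers standing in for ReliefF, chi-square, and symmetrical uncertainty outputs, the function names (`top_n`, `ensemble`, `genetic_search`, `toy_fitness`) are hypothetical, and the fitness function is a toy stand-in, since the abstract does not specify how the GA evaluates a subset (a wrapper would typically use classifier accuracy).

```python
import random

def top_n(scores, n):
    """Return the indices of the n highest-scoring features."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:n])

def ensemble(score_lists, n):
    """Union and intersection of the top-n features of each filter ranking."""
    tops = [top_n(s, n) for s in score_lists]
    return set.union(*tops), set.intersection(*tops)

def genetic_search(candidates, fitness, pop_size=20, generations=30,
                   mutation_rate=0.1, seed=0):
    """Simple GA over bit masks on `candidates`; returns the best subset found."""
    rng = random.Random(seed)
    genes = sorted(candidates)

    def decode(mask):
        return {g for g, bit in zip(genes, mask) if bit}

    def score(mask):
        return fitness(decode(mask))

    pop = [[rng.randint(0, 1) for _ in genes] for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        next_pop = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (tournament size 3).
            p1 = max(rng.sample(pop, 3), key=score)
            p2 = max(rng.sample(pop, 3), key=score)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, len(genes)) if len(genes) > 1 else 0
            child = p1[:cut] + p2[cut:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=score)
    return decode(best)

# Made-up filter scores for six genes (assumed values, for illustration only).
relieff = [0.9, 0.1, 0.8, 0.3, 0.2, 0.7]
chi2 = [0.2, 0.9, 0.8, 0.1, 0.3, 0.6]
sym_unc = [0.7, 0.2, 0.9, 0.1, 0.8, 0.3]

union, inter = ensemble([relieff, chi2, sym_unc], n=2)

def toy_fitness(subset):
    """Toy stand-in for classifier accuracy: reward genes 0 and 2, penalize size."""
    return len(subset & {0, 2}) - 0.1 * len(subset)

selected = genetic_search(union, toy_fitness)
print("union:", sorted(union))
print("intersection:", sorted(inter))
print("GA-selected subset of the union:", sorted(selected))
```

With these scores the union holds four candidate genes and the intersection only one, which shows why the union gives the GA a larger search space to refine.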
Pages: 159-176
Page count: 17