A Novel Hybrid Feature Selection Model for Classification of Neuromuscular Dystrophies Using Bhattacharyya Coefficient, Genetic Algorithm and Radial Basis Function Based Support Vector Machine

Cited by: 2
Authors
Anand, Divya [1]
Pandey, Babita [2]
Pandey, Devendra K. [3]
Affiliations
[1] Lovely Profess Univ, Sch Comp Sci & Engn, Chaheru, Punjab, India
[2] Lovely Profess Univ, Sch Comp Applicat, Chaheru, Punjab, India
[3] Lovely Profess Univ, Sch Biosci, Chaheru, Punjab, India
Keywords
Bhattacharyya coefficient; Genetic algorithm; Support vector machine; Neuromuscular disorders; Microarray data; Radial basis function; Cancer classification; Diagnosis; SVM; System
DOI
10.1007/s12539-016-0183-6
CLC number
Q [Biological Sciences]
Subject classification codes
07; 0710; 09
Abstract
Accurate classification of neuromuscular disorders is important for providing proper treatment to patients. Microarray technology has recently been employed to monitor the activity, or expression level, of a large number of genes simultaneously. Gene expression data derived from a microarray experiment usually involve a large number of genes but very few samples. The dimension of such data therefore needs to be reduced to a small set of discriminative genes that accurately classifies samples of various kinds of diseases. Our goal is to find a small subset of genes that ensures accurate classification of neuromuscular disorders. In this paper, we propose a novel hybrid feature selection model for the classification of neuromuscular disorders. Feature selection is carried out in two phases by integrating the Bhattacharyya coefficient and a genetic algorithm (GA). In the first phase, we compute the Bhattacharyya coefficient to choose a candidate gene subset by removing the most redundant genes. In the second phase, the target gene subset is created by selecting the most discriminative genes with a GA whose fitness function is calculated using a radial basis function support vector machine (RBF SVM). The proposed hybrid algorithm is applied to two publicly available microarray datasets of neuromuscular disorders. The results are compared with two individual feature selection techniques, the Bhattacharyya coefficient and GA, and with one integrated technique, Bhattacharyya-GA, in which the GA fitness function is calculated using four other classifiers. The comparison shows that the proposed integrated method gives better classification accuracy.
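The two-phase selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the Bhattacharyya coefficient is estimated or how the GA is configured. The sketch assumes a two-class problem, fits a univariate Gaussian to each class for every gene, and keeps the genes whose class distributions overlap least. The function names (`bhattacharyya_coefficient`, `candidate_genes`, `genetic_search`), the GA operators (truncation selection, one-point crossover, bit-flip mutation) and all parameter defaults are hypothetical choices made for illustration. The `fitness` argument is a placeholder for the RBF SVM accuracy the abstract refers to.

```python
import math
import random
from statistics import mean, pvariance


def bhattacharyya_coefficient(a, b, eps=1e-12):
    """Overlap of two univariate Gaussians fitted to samples a and b.

    Returns a value in (0, 1]; 1 means the fitted distributions coincide.
    The Gaussian fit is an assumption of this sketch, not taken from the paper.
    """
    m1, m2 = mean(a), mean(b)
    v1, v2 = pvariance(a) + eps, pvariance(b) + eps  # eps avoids division by zero
    distance = (0.25 * (m1 - m2) ** 2 / (v1 + v2)
                + 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2))))
    return math.exp(-distance)


def candidate_genes(X, y, k):
    """Phase 1 (assumed reading): keep the k genes with the least class overlap.

    X is a list of samples (each a list of expression values), y holds 0/1 labels.
    """
    class0 = [row for row, label in zip(X, y) if label == 0]
    class1 = [row for row, label in zip(X, y) if label == 1]
    scored = []
    for g in range(len(X[0])):
        bc = bhattacharyya_coefficient([r[g] for r in class0],
                                       [r[g] for r in class1])
        scored.append((bc, g))
    return [g for _, g in sorted(scored)[:k]]


def genetic_search(genes, fitness, pop_size=20, generations=30,
                   p_mut=0.1, seed=0):
    """Phase 2: GA over bit masks of the candidate genes.

    fitness(subset) should return a score to maximise; the paper uses
    RBF SVM classification accuracy, which is left to the caller here.
    Requires pop_size >= 4 so that two parents can always be sampled.
    """
    rng = random.Random(seed)
    n = len(genes)

    def subset(mask):
        return [g for g, bit in zip(genes, mask) if bit]

    def score(mask):
        # An empty subset cannot be classified, so it always ranks last.
        return fitness(subset(mask)) if any(mask) else float("-inf")

    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]       # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = p1[:cut] + p2[cut:]             # one-point crossover
            child = [1 - bit if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        population = parents + children             # parents survive unchanged
    return subset(max(population, key=score))
```

In practice, `fitness` could return the cross-validated accuracy of an RBF-kernel SVM trained only on the selected columns, for example with scikit-learn's `SVC(kernel="rbf")` and `cross_val_score`. Because this sketch re-evaluates `fitness` on every sort, caching the scores per mask would be advisable when each evaluation trains a classifier.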
Pages: 244-250
Number of pages: 7