A hybrid LDA and genetic algorithm for gene selection and classification of microarray data

Cited by: 48
Authors
Bonilla Huerta, Edmundo [2]
Duval, Beatrice [1]
Hao, Jin-Kao [1]
Affiliations
[1] Univ Angers, LERIA, F-49045 Angers, France
[2] Inst Tecnol Apizaco, Apizaco 90300, Tlaxcala, Mexico
Keywords
Gene selection; Classification; Dedicated genetic algorithm; Linear discriminant analysis; SUPPORT VECTOR MACHINE; CANCER CLASSIFICATION; DISCRIMINANT-ANALYSIS; EXPRESSION DATA; MOLECULAR CLASSIFICATION; TISSUE CLASSIFICATION; BIOMARKER DISCOVERY; PREDICTION; TUMOR; VALIDATION
DOI
10.1016/j.neucom.2010.03.024
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In supervised classification of microarray data, gene selection aims to identify a small subset of informative genes from the initial data in order to obtain high predictive accuracy. This paper introduces a new embedded approach to this difficult task in which a genetic algorithm (GA) is combined with Fisher's linear discriminant analysis (LDA). The defining characteristic of this LDA-based GA is that it uses not only an LDA classifier in its fitness function, but also LDA's discriminant coefficients in its dedicated crossover and mutation operators. Computational experiments on seven public datasets show that, under an unbiased experimental protocol, the proposed algorithm reaches high prediction accuracies with a small number of selected genes. (C) 2010 Elsevier B.V. All rights reserved.
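The abstract names two places where LDA enters the GA: the fitness function and the genetic operators. The following is a minimal illustrative sketch of that idea, not the authors' implementation. It assumes a two-class problem, scores a gene subset by training accuracy (a simplification; an unbiased protocol would use held-out data), adds a small ridge term to keep the scatter matrix invertible, and uses a hypothetical mutation rule that drops the selected gene with the smallest absolute discriminant coefficient. All function names, parameters, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def lda_coefficients(X, y, ridge=1e-3):
    """Two-class Fisher LDA: return the discriminant vector w and a threshold."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix.
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    # Assumption: a ridge term keeps Sw invertible when genes outnumber samples.
    Sw = Sw + ridge * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    threshold = (w @ (m0 + m1)) / 2.0
    return w, threshold


def fitness(mask, X, y):
    """Hypothetical fitness: LDA training accuracy on the genes selected by mask."""
    idx = np.flatnonzero(mask)
    if idx.size == 0:
        return 0.0, np.zeros(0)
    w, threshold = lda_coefficients(X[:, idx], y)
    pred = (X[:, idx] @ w > threshold).astype(int)
    return float((pred == y).mean()), w


def guided_mutation(mask, w):
    """Hypothetical operator: drop the selected gene with the smallest |coefficient|."""
    idx = np.flatnonzero(mask)
    child = mask.copy()
    if idx.size > 1:
        child[idx[np.argmin(np.abs(w))]] = False
    return child


# Synthetic data (assumption): 40 samples, 6 genes, only gene 0 is informative.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 6))
X[y == 1, 0] += 3.0

mask = np.ones(6, dtype=bool)          # start with every gene selected
acc, w = fitness(mask, X, y)           # LDA accuracy and coefficients
child = guided_mutation(mask, w)       # one gene removed, guided by |w|
print(acc, int(child.sum()))
```

In a full GA, `fitness` would rank a population of such masks, and a crossover operator could likewise prefer genes with large absolute coefficients from either parent; the choice of those rules here is a guess at the general scheme rather than a description of the paper's operators.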
Pages: 2375-2383
Page count: 9