A hybrid LDA and genetic algorithm for gene selection and classification of microarray data

Times cited: 48
Authors
Bonilla Huerta, Edmundo [2 ]
Duval, Beatrice [1 ]
Hao, Jin-Kao [1 ]
Affiliations
[1] Univ Angers, LERIA, F-49045 Angers, France
[2] Inst Tecnol Apizaco, Apizaco 90300, Tlaxcala, Mexico
Keywords
Gene selection; Classification; Dedicated genetic algorithm; Linear discriminant analysis; Support vector machine; Cancer classification; Discriminant analysis; Expression data; Molecular classification; Tissue classification; Biomarker discovery; Prediction; Tumor; Validation
DOI
10.1016/j.neucom.2010.03.024
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In supervised classification of microarray data, gene selection aims to identify a small subset of informative genes from the initial data in order to obtain high predictive accuracy. This paper introduces a new embedded approach to this difficult task in which a genetic algorithm (GA) is combined with Fisher's linear discriminant analysis (LDA). The defining characteristic of this LDA-based GA is that the GA uses not only an LDA classifier in its fitness function, but also LDA's discriminant coefficients in its dedicated crossover and mutation operators. Computational experiments on seven public datasets show that, under an unbiased experimental protocol, the proposed algorithm reaches high prediction accuracies with a small number of selected genes. © 2010 Elsevier B.V. All rights reserved.
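To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a two-class problem, uses plain training accuracy as the fitness (a real protocol would more likely use cross-validation to avoid selection bias), and adds a small ridge term so that the scatter matrix stays invertible. The function names `lda_coefficients`, `fitness`, and `lda_guided_mutation` are hypothetical, and the mutation rule shown (drop the gene with the smallest absolute coefficient) is only one plausible way to let LDA coefficients guide an operator.

```python
import numpy as np

def lda_coefficients(X, y, ridge=1e-3):
    """Two-class Fisher LDA: w = Sw^{-1} (mu1 - mu0), plus a midpoint threshold."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class scatter matrix.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    # Assumption: a small ridge term keeps Sw invertible when genes outnumber samples.
    Sw = Sw + ridge * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, mu1 - mu0)
    threshold = w @ (mu0 + mu1) / 2.0
    return w, threshold

def fitness(mask, X, y):
    """Accuracy of LDA restricted to the genes selected by a boolean mask.

    Simplification: accuracy is measured on the training data itself.
    Returns (accuracy, coefficients of the selected genes).
    """
    if not mask.any():
        return 0.0, None
    Xs = X[:, mask]
    w, threshold = lda_coefficients(Xs, y)
    pred = (Xs @ w > threshold).astype(int)
    return float((pred == y).mean()), w

def lda_guided_mutation(mask, w):
    """Deselect the gene whose LDA coefficient has the smallest magnitude."""
    selected = np.flatnonzero(mask)
    child = mask.copy()
    child[selected[np.argmin(np.abs(w))]] = False
    return child
```

In a GA loop, each individual would be such a boolean mask over genes: `fitness` scores it, and the returned coefficients are passed to `lda_guided_mutation` to produce a smaller candidate subset.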
Pages: 2375-2383
Page count: 9