SVM-RFE With MRMR Filter for Gene Selection

Cited by: 246
Authors
Mundra, Piyushkumar A. [1]
Rajapakse, Jagath C. [1, 2]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, BioInformat Res Ctr, Singapore 637553, Singapore
[2] MIT, Dept Biol Engn, Cambridge, MA 02139 USA
Keywords
Cancer classification; gene redundancy; gene relevancy; mutual information; support vector machine recursive feature elimination (SVM-RFE); MICROARRAY DATA; CANCER CLASSIFICATION; MUTUAL INFORMATION; EXPRESSION; RELEVANCE; REDUNDANCY; PREDICTION; FEATURES;
DOI
10.1109/TNB.2009.2035284
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
We enhance the support vector machine recursive feature elimination (SVM-RFE) method for gene selection by incorporating a minimum-redundancy maximum-relevancy (MRMR) filter. The relevancy of a set of genes is measured by the mutual information between the genes and the class labels, and the redundancy is given by the mutual information among the genes. Because it takes the redundancy among genes into account during selection, the method improved the discrimination of cancer tissues from benign tissues on several benchmark datasets. It also selected fewer genes than either MRMR or SVM-RFE on most datasets. Gene ontology analyses revealed that the selected genes are relevant for distinguishing cancerous samples and have similar functional properties. The method provides a framework for combining filter and wrapper methods of gene selection, as illustrated here with MRMR and SVM-RFE.
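The combination described in the abstract can be sketched as one elimination step. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the SVM weights and the MRMR terms are combined, so the convex combination with a hypothetical weight `beta`, the relevancy-minus-redundancy form of the MRMR term, and all function names are assumptions. Expression values are assumed to be already discretized, and the linear-SVM weights are taken as input rather than trained here.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log2((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )


def relevancy(genes, labels):
    """Mutual information of each gene with the class labels."""
    return [mutual_information(g, labels) for g in genes]


def redundancy(genes, i):
    """Mean mutual information between gene i and the other remaining genes."""
    others = [g for j, g in enumerate(genes) if j != i]
    if not others:
        return 0.0
    return sum(mutual_information(genes[i], g) for g in others) / len(others)


def rank_scores(genes, labels, svm_weights, beta=0.5):
    """Assumed score: beta * normalized |w_i| + (1 - beta) * (relevancy_i - redundancy_i)."""
    w_max = max(abs(w) for w in svm_weights) or 1.0  # avoid dividing by zero
    rel = relevancy(genes, labels)
    return [
        beta * abs(w) / w_max + (1 - beta) * (rel[i] - redundancy(genes, i))
        for i, w in enumerate(svm_weights)
    ]


def gene_to_eliminate(genes, labels, svm_weights, beta=0.5):
    """Index of the lowest-scoring gene, i.e. the one dropped in this step."""
    scores = rank_scores(genes, labels, svm_weights, beta)
    return min(range(len(scores)), key=scores.__getitem__)


# Toy data: genes 0 and 1 are identical and track the labels; gene 2 is unrelated.
genes = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
labels = [0, 0, 1, 1]
svm_weights = [0.9, 0.1, 0.2]  # hypothetical linear-SVM weights
worst = gene_to_eliminate(genes, labels, svm_weights)
```

In a full recursive procedure, the SVM would be retrained on the remaining genes after each elimination and the scores recomputed, until the desired number of genes is left.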
Pages: 31-37
Number of pages: 7