Two-stage gene selection for support vector machine classification of microarray data

Cited by: 7
Authors
Xia, Xiao-Lei [1]
Li, Kang [1]
Irwin, George W. [1]
Affiliation
[1] Queen's Univ Belfast, Sch Elect Elect Engn & Comp Sci, Ashby Bldg, Stranmillis Rd, Belfast BT9 5AH, Antrim, Northern Ireland
Keywords
support vector machines; SVM; two-stage linear regression; gene selection; baseline method; significance analysis of microarrays; SAM
DOI
10.1504/IJMIC.2009.029029
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a new, stable gene selection method for support vector machine (SVM) classification of microarray data, aiming to improve classification accuracy. A two-stage algorithm selects genes and constructs a compact multivariate linear regression model that contains fewer genes than there are experiments, together with a weight vector for each gene index. An SVM then learns the microarray data based on this linear regression model. Experimental results on two well-known microarray datasets show that SVMs with two-stage gene selection maintain consistently high accuracy with a small number of genes. The proposed method also outperforms two other typical gene selection methods, the baseline method and significance analysis of microarrays (SAM), in terms of accuracy.
Pages: 164-171
Page count: 8