A Gene Selection Software Package for Cancer Classification

Citations: 0
Authors
Peng, Sihua [2 ]
Liu, Xiaoping [3 ]
Yu, Jiyang [2 ]
Peng, Xiaoning [4 ]
Chen, Liangbiao [1 ]
Affiliations
[1] Chinese Acad Sci, Inst Genet & Dev Biol, Beijing 100864, Peoples R China
[2] Zhejiang Univ, Sch Med, Dept Pathol, Hangzhou, Zhejiang, Peoples R China
[3] Xinjiang University, Coll Life Sci & Technol, Xinjiang, Peoples R China
[4] Hunan Normal Univ, Sch Med, Hunan, Peoples R China
Source
2009 WRI WORLD CONGRESS ON SOFTWARE ENGINEERING, VOL 2, PROCEEDINGS | 2009
Funding
National High Technology Research and Development Program of China (863 Program);
Keywords
SUPPORT VECTOR MACHINES; MOLECULAR CLASSIFICATION; EXPRESSION SIGNATURES; MICROARRAY DATA; TUMOR; PREDICTION; PATTERNS;
DOI
10.1109/WCSE.2009.41
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Selecting a small number of relevant genes for accurate classification of samples is essential for the development of diagnostic tests and has been the subject of considerable research in the past few years. However, many researchers are still trying to improve the algorithms to obtain better results. Here we present a novel implementation of the Recursive Feature Elimination method (nRFE) for gene selection and classification of microarray data. Our algorithm was evaluated on the NCI60 benchmark datasets, achieving an accuracy of 96.6% in 10-fold cross-validation. Furthermore, nRFE outperformed recently published algorithms when applied to two other multi-cancer data sets. Computational evidence indicated that nRFE can effectively avoid overfitting. The combination of high accuracy and a small number of selected genes should make nRFE a powerful tool for gene selection from gene expression data.
Pages: 104 / +
Page count: 3