A support vector machine ensemble for cancer classification using gene expression data

Cited by: 0
Authors
Liao, Chen [1 ]
Li, Shutao [1 ]
Affiliations
[1] Hunan Univ, Coll Elect & Informat Engn, Changsha 410082, Peoples R China
Source
BIOINFORMATICS RESEARCH AND APPLICATIONS, PROCEEDINGS | 2007 / Vol. 4463
Keywords
support vector machine; Wilcoxon rank sum test; gene selection; ensemble classifier; classification accuracy
DOI
Not available
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010 ; 081704 ;
Abstract
In this paper, we propose a support vector machine (SVM) ensemble classification method. First, the dataset is preprocessed with the Wilcoxon rank sum test to filter out irrelevant genes. Next, one SVM is trained on the training set and then tested on that same training set to obtain prediction results. The samples that are misclassified or predicted with low confidence are selected to train a second SVM, which is then tested in the same way. Similarly, a third SVM is trained on the samples that the second SVM cannot classify correctly with high confidence. The three SVMs form the SVM ensemble classifier. Finally, the testing set is fed into the ensemble classifier, and the final test predictions are obtained by majority voting. Experiments are performed on two standard benchmark datasets: Breast Cancer and ALL/AML Leukemia. The experimental results demonstrate that the proposed method can reach state-of-the-art classification performance.
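The pipeline described in the abstract (rank sum gene filtering, a three-stage cascade trained on hard samples, majority voting) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. To stay dependency-free it uses a nearest-centroid scorer as a stand-in for the SVM; in practice an SVM's signed decision value would play the role of `score`. The function names, the `margin` threshold, the labels +1/-1, and the toy data are all assumptions made for the example.

```python
from statistics import mean

def rank_sum_z(a, b):
    """Normal-approximation z-score of the Wilcoxon rank sum of sample a
    against sample b (no tie correction in the variance)."""
    pooled = sorted((v, g) for g, grp in enumerate((a, b)) for v in grp)
    ranks, i = [0.0] * len(pooled), 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average 1-based rank for ties
        i = j + 1
    n1, n2 = len(a), len(b)
    w = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    return (w - mu) / sigma

def select_genes(X, y, k):
    """Indices of the k genes whose values differ most between classes."""
    scores = []
    for g in range(len(X[0])):
        pos = [row[g] for row, lab in zip(X, y) if lab == 1]
        neg = [row[g] for row, lab in zip(X, y) if lab == -1]
        scores.append((abs(rank_sum_z(pos, neg)), g))
    return sorted(g for _, g in sorted(scores, reverse=True)[:k])

def train_centroid(X, y):
    """Stand-in base learner (assumption): returns a signed score whose
    sign is the predicted label and whose magnitude is the confidence."""
    def centroid(label):
        rows = [r for r, lab in zip(X, y) if lab == label]
        return [mean(col) for col in zip(*rows)]
    cp, cn = centroid(1), centroid(-1)
    def score(x):
        dp = sum((a - b) ** 2 for a, b in zip(x, cp))
        dn = sum((a - b) ** 2 for a, b in zip(x, cn))
        return dn - dp
    return score

def train_cascade(X, y, train=train_centroid, margin=1.0, n_models=3):
    """Train up to n_models learners, each on the samples the previous one
    got wrong or classified with confidence below `margin`."""
    models, cur_X, cur_y = [], X, y
    for _ in range(n_models):
        model = train(cur_X, cur_y)
        models.append(model)
        # Re-test on the model's own training set and keep the hard samples.
        hard = [(x, lab) for x, lab in zip(cur_X, cur_y) if lab * model(x) < margin]
        if len({lab for _, lab in hard}) < 2:
            break  # cannot train a two-class learner on what is left
        cur_X, cur_y = [x for x, _ in hard], [lab for _, lab in hard]
    return models

def predict(models, x):
    """Majority vote over the ensemble; a tie falls back to the first model."""
    votes = [1 if m(x) >= 0 else -1 for m in models]
    total = sum(votes)
    return votes[0] if total == 0 else (1 if total > 0 else -1)

# Toy data: 8 samples x 4 genes; genes 0 and 2 carry the class signal.
X = [[5.0, 1.0, 4.0, 0.3], [6.0, 0.2, 5.5, 0.9], [5.5, 0.7, 4.5, 0.1], [4.8, 0.4, 5.0, 0.6],
     [1.0, 0.9, 1.5, 0.2], [2.0, 0.3, 0.5, 0.8], [1.5, 0.6, 1.0, 0.4], [3.9, 0.5, 3.5, 0.7]]
y = [1, 1, 1, 1, -1, -1, -1, -1]
test_samples = [[5.2, 0.5, 4.8, 0.5], [1.2, 0.5, 0.8, 0.5]]

genes = select_genes(X, y, k=2)
X_sel = [[row[g] for g in genes] for row in X]
models = train_cascade(X_sel, y, margin=15.0)
print(genes, [predict(models, [s[g] for g in genes]) for s in test_samples])
```

The condition `lab * model(x) < margin` covers both cases named in the abstract: a negative product means a misclassified sample, and a small positive product means a correct but low-confidence one. The threshold value is a free parameter in this sketch.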
Pages: 488 / +
Page count: 2
Related papers
10 items in total
[1]  
BENDOR A, 2000, P 4 ANN INT C COMP M, P54
[2]  
Cristianini N., 2000, Intelligent Data Analysis: An Introduction
[3]  
GOLUB T, MOL CLASSIFICATION C, V286, P531
[4]   Feature generation using genetic programming with application to fault classification [J].
Guo, H ;
Jack, LB ;
Nandi, AK .
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2005, 35 (01) :89-99
[5]  
HUERTA EB, 2006, EVO WORKSH, P34
[6]  
Krishnapuram B., 2004, KERNEL METHODS COMPU
[7]   A comparative study of feature selection and multiclass classification methods for tissue classification based on gene expression [J].
Li, T ;
Zhang, CL ;
Ogihara, M .
BIOINFORMATICS, 2004, 20 (15) :2429-2437
[8]  
Park P J, 2001, Pac Symp Biocomput, P52
[9]   Gene selection from microarray data for cancer classification - a machine learning approach [J].
Wang, Y ;
Tetko, IV ;
Hall, MA ;
Frank, E ;
Facius, A ;
Mayer, KFX ;
Mewes, HW .
COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2005, 29 (01) :37-46
[10]   Predicting the clinical status of human breast cancer by using gene expression profiles [J].
West, M ;
Blanchette, C ;
Dressman, H ;
Huang, E ;
Ishida, S ;
Spang, R ;
Zuzan, H ;
Olson, JA ;
Marks, JR ;
Nevins, JR .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2001, 98 (20) :11462-11467