Semi-supervised SVM-based Feature Selection for Cancer Classification using Microarray Gene Expression Data

Cited by: 32
Authors
Ang, Jun Chin [1]
Haron, Habibollah [1]
Hamed, Haza Nuzly Abdull [1]
Affiliations
[1] Univ Teknol Malaysia, Fac Comp, Dept Comp Sci, Skudai, Johor, Malaysia
Source
CURRENT APPROACHES IN APPLIED ARTIFICIAL INTELLIGENCE | 2015 / Vol. 9101
Keywords
Support vector machines; Semi-supervised; Feature selection; Cancer; Gene expression; SUPPORT;
DOI
10.1007/978-3-319-19066-2_45
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Gene expression data typically suffer from high dimensionality, so feature selection is a fundamental tool in cancer classification analysis. Unlabelled data can be collected easily and are useful for improving classification accuracy, whereas label information is usually difficult to obtain because labelling is tedious, costly and error-prone. Previous studies of gene selection are mostly dedicated to supervised and unsupervised approaches. The support vector machine (SVM) is a common supervised technique for gene selection and cancer classification. This paper therefore proposes a semi-supervised SVM-based feature selection method (S3VM-FS), which simultaneously exploits knowledge from unlabelled and labelled data. Experimental results on lung cancer gene expression data show that S3VM-FS achieves higher accuracy and requires less processing time than the well-known supervised method, SVM-based recursive feature elimination (SVM-RFE), and the improved method, S3VM-RFE.
Pages: 468-477
Number of pages: 10