A Stable Gene Subset Selection Algorithm for Cancers

Cited by: 3
Authors
Xie, Juanying [1]
Gao, Hongchao [1]
Affiliation
[1] Shaanxi Normal Univ, Sch Comp Sci, Xian 710062, Peoples R China
Source
HEALTH INFORMATION SCIENCE (HIS 2015) | 2015 / Vol. 9085
Keywords
Gene selection; Gene subsets; K-means; Assemble; Pearson correlation coefficient; Cancers; Classification
DOI
10.1007/978-3-319-19156-0_12
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To address the problem that the genes chosen by gene subset selection algorithms depend on the training subset, we propose an assemble (ensemble) method for selecting discriminative genes for cancers, so that a stable gene subset can be obtained. We randomly draw a proportion of the samples from the training subset and cluster the genes of these samples with K-means, then select a representative gene from each cluster according to its weight, estimated by the Pearson correlation coefficient between the gene and the class labels. This process is repeated several times, and the genes selected most frequently across the repetitions form the final gene subset. The method is tested on three widely used gene datasets, and the experimental results show that the proposed algorithm finds the most stable gene subset with the highest classification accuracy.
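The procedure described in the abstract can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function names (`pearson`, `kmeans`, `stable_gene_subset`), the parameters (`n_rounds`, `sample_frac`, `n_select`), the use of the absolute Pearson coefficient as the gene weight, and the plain Lloyd-style K-means are all assumptions, since the abstract does not specify them.

```python
import random
from collections import Counter
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant vector: treat as uncorrelated (assumption)
    return cov / (var_x * var_y) ** 0.5


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, n_iter=20):
    """Plain Lloyd-style K-means; returns a cluster index for each point."""
    centers = [list(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        assign = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [mean(col) for col in zip(*members)]
    return assign


def stable_gene_subset(X, y, k, n_rounds=20, sample_frac=0.8,
                       n_select=None, seed=0):
    """X: list of samples, each a list of gene values; y: numeric labels."""
    rng = random.Random(seed)
    n_samples, n_genes = len(X), len(X[0])
    counts = Counter()
    for _ in range(n_rounds):
        # Randomly draw a proportion of the training samples.
        size = max(2, int(sample_frac * n_samples))
        idx = rng.sample(range(n_samples), size)
        y_sub = [y[i] for i in idx]
        # Each gene is a point whose coordinates are its values on the draw.
        genes = [[X[i][g] for i in idx] for g in range(n_genes)]
        weights = [abs(pearson(col, y_sub)) for col in genes]
        assign = kmeans(genes, k, rng)
        # Keep the highest-weight gene of every non-empty cluster.
        for c in range(k):
            members = [g for g in range(n_genes) if assign[g] == c]
            if members:
                counts[max(members, key=weights.__getitem__)] += 1
    # Genes picked most often across the rounds form the final subset.
    return [g for g, _ in counts.most_common(n_select or k)]
```

The sketch uses only the standard library so that it stays self-contained; on real microarray data with thousands of genes, a vectorised K-means from a numerical library would be the more practical choice.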
Pages: 111-122
Number of pages: 12