Hybrid Method Based on Information Gain and Support Vector Machine for Gene Selection in Cancer Classification

Cited by: 81
Authors
Gao, Lingyun [1]
Ye, Mingquan [1]
Lu, Xiaojie [1]
Huang, Daobin [1]
Affiliations
[1] Wannan Med Coll, Sch Med Informat, Wuhu 241002, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Gene selection; Cancer classification; Information gain; Support vector machine; Small sample size with high dimension; CONGENITAL MUSCULAR-DYSTROPHY; HEPSIN GENE; EXPRESSION; OPTIMIZATION; MUTATIONS; ALGORITHM; VARIANTS; INPP5K;
DOI
10.1016/j.gpb.2017.08.002
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007; 090102;
Abstract
Achieving sufficient cancer classification accuracy with the entire set of genes remains a great challenge because gene expression data are high-dimensional, have small sample sizes, and are noisy. We therefore propose a hybrid gene selection method, Information Gain-Support Vector Machine (IG-SVM). IG is first used to filter out irrelevant and redundant genes. SVM is then used to remove further redundant genes, eliminating noise in the datasets more effectively. Finally, the informative genes selected by IG-SVM serve as the input to the LIBSVM classifier. Evaluated on five cancer gene expression datasets, IG-SVM showed the highest classification accuracy among the related algorithms compared while using only a few selected genes. For example, IG-SVM achieved a classification accuracy of 90.32% for colon cancer, which is difficult to classify accurately, using only three genes: CSRP1, MYL9, and GUCA2B.
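The first stage described in the abstract, ranking genes by information gain, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the median-split discretization, the function names, and the toy gene names and values are all assumptions, since the abstract does not specify them. The SVM-based refinement and the LIBSVM classification stages are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels):
    """Information gain of one gene after a binary split at its median value."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    # Median threshold: an assumption, the abstract does not state the discretization.
    threshold = ordered[mid] if len(ordered) % 2 else (ordered[mid - 1] + ordered[mid]) / 2
    low = [y for x, y in zip(values, labels) if x <= threshold]
    high = [y for x, y in zip(values, labels) if x > threshold]
    # Conditional entropy H(class | split), weighted by the size of each side.
    conditional = sum(len(part) / len(labels) * entropy(part) for part in (low, high) if part)
    return entropy(labels) - conditional


def rank_genes(expression, labels, top_k):
    """Return the names of the top_k genes ranked by information gain."""
    scores = {gene: information_gain(vals, labels) for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy data: 8 samples and 3 hypothetical genes (not taken from the paper's datasets).
labels = ["tumor"] * 4 + ["normal"] * 4
expression = {
    "gene_a": [9.1, 8.7, 9.4, 8.9, 2.1, 1.8, 2.5, 2.2],  # separates the classes cleanly
    "gene_b": [5.0, 1.0, 5.2, 1.1, 5.1, 0.9, 5.3, 1.2],  # unrelated to the class label
    "gene_c": [7.0, 7.2, 6.8, 3.0, 7.1, 3.1, 2.9, 3.3],  # partially informative
}
selected = rank_genes(expression, labels, top_k=2)
print(selected)
```

In a hybrid scheme of this kind, the genes that survive the filter would then be passed to a second, classifier-driven selection step, which keeps the more expensive stage confined to a small candidate set.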
Pages: 389-395
Page count: 7
Related papers
44 in total
[31]   Detection of biomarkers for Hepatocellular Carcinoma using a hybrid univariate gene selection methods [J].
Samee, Nagwan M. Abdel ;
Solouma, Nahed H. ;
Kadah, Yasser M. .
THEORETICAL BIOLOGY AND MEDICAL MODELLING, 2012, 9
[32]   A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization [J].
Sharbaf, Fatemeh Vafaee ;
Mosafer, Sara ;
Moattar, Mohammad Hossein .
GENOMICS, 2016, 107 (06) :231-238
[33]   A novel aggregate gene selection method for microarray data classification [J].
Thanh Nguyen ;
Khosravi, Abbas ;
Creighton, Douglas ;
Nahavandi, Saeid .
PATTERN RECOGNITION LETTERS, 2015, 60-61 :16-23
[34]   Gene Expression Data Classification using Support Vector Machine and Mutual Information-based Gene Selection [J].
Vanitha, Devi Arockia C. ;
Devaraj, D. ;
Venkatesulu, M. .
GRAPH ALGORITHMS, HIGH PERFORMANCE IMPLEMENTATIONS AND ITS APPLICATIONS (ICGHIA 2014), 2015, 47 :13-21
[35]  
Vural H, 2015, Modeling of Artificial Intelligence, V6, P171, DOI 10.13187/mai.2015.6.171
[36]   Instantaneous simulation of fluids and particles in complex microfluidic devices [J].
Wang, Junchao ;
Rodgers, Victor G. J. ;
Brisk, Philip ;
Grover, William H. .
PLOS ONE, 2017, 12 (12)
[37]   Identification of lung cancer oncogenes based on the mRNA expression and single nucleotide polymorphism profile data [J].
Wang, Y. ;
Mei, Q. ;
Ai, Y. Q. ;
Li, R. Q. ;
Chang, L. ;
Li, Y. F. ;
Xia, Y. X. ;
Li, W. H. ;
Chen, Y. .
NEOPLASMA, 2015, 62 (06) :966-973
[38]   HykGene: a hybrid approach for selecting marker genes for phenotype classification using microarray gene expression data [J].
Wang, YH ;
Makedon, FS ;
Ford, JC ;
Pearlman, J .
BIOINFORMATICS, 2005, 21 (08) :1530-1537
[39]  
Weng Howe Chan, 2016, International Journal of Bioinformatics Research and Applications, V12, P72
[40]   Mutations in INPP5K, Encoding a Phosphoinositide 5-Phosphatase, Cause Congenital Muscular Dystrophy with Cataracts and Mild Cognitive Impairment [J].
Wiessner, Manuela ;
Roos, Andreas ;
Munn, Christopher J. ;
Viswanathan, Ranjith ;
Whyte, Tamieka ;
Cox, Dan ;
Schoser, Benedikt ;
Sewry, Caroline ;
Roper, Helen ;
Phadke, Rahul ;
Bettolo, Chiara Marini ;
Barresi, Rita ;
Charlton, Richard ;
Bonnemann, Carsten G. ;
Neto, Osorio Abath ;
Reed, Umbertina C. ;
Zanoteli, Edmar ;
Moreno, Cristiane Araujo Martins ;
Ertl-Wagner, Birgit ;
Stucka, Rolf ;
De Goede, Christian ;
da Silva, Tamiris Borges ;
Hathazi, Denisa ;
Dell'Aica, Margherita ;
Zahedi, Rene P. ;
Thiele, Simone ;
Muller, Juliane ;
Kingston, Helen ;
Mueller, Susanna ;
Curtis, Elizabeth ;
Walter, Maggie C. ;
Strom, Tim M. ;
Straub, Volker ;
Bushby, Kate ;
Muntoni, Francesco ;
Swan, Laura E. ;
Lochmuller, Hanns ;
Senderek, Jan .
AMERICAN JOURNAL OF HUMAN GENETICS, 2017, 100 (03) :523-536