Microarray gene expression data have attracted great interest for cancer classification and for further research in bioinformatics. However, because of the "large p, small n" setting, in which biosamples are limited and the data are high-dimensional, gene selection has become a demanding task: it aims to select a minimal number of discriminatory genes closely associated with a phenotype. Feature (gene) selection remains challenging because finding an optimal feature subset is an NP-hard problem, so most existing feature selection algorithms rely on heuristic rules. Here we propose MGRFE, a multilayer recursive feature elimination method based on an embedded integer-coded genetic algorithm, which aims to select the gene combination with minimal size and maximal information. On 19 benchmark microarray datasets, including multiclass and imbalanced datasets, MGRFE outperforms state-of-the-art feature selection algorithms, achieving better cancer classification accuracy with fewer selected genes. MGRFE can be regarded as a promising feature selection method for high-dimensional datasets, especially gene expression data. Moreover, the genes selected by MGRFE are closely biologically relevant to cancer phenotypes. The source code of the proposed algorithm and all 19 datasets used in this paper are available at https://github.com/Pengeace/MGRFE-GaRFE.
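To make the phrase "integer-coded genetic algorithm with recursive feature elimination" more concrete, the following is a minimal toy sketch, not the MGRFE implementation from the repository above. It assumes a chromosome is a list of gene indices (integers) rather than a binary mask, uses a leave-one-out nearest-centroid classifier as a stand-in fitness function, and omits crossover and the multilayer structure. All function names (`loo_accuracy`, `fitness`, `mutate`, `shrink`, `ga_rfe`) and the toy data are hypothetical.

```python
import random

def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen genes."""
    correct = 0
    for i in range(len(X)):
        sums, counts = {}, {}
        # Accumulate per-class sums over every sample except the held-out one.
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(genes))
            for k, g in enumerate(genes):
                acc[k] += row[g]
            counts[label] = counts.get(label, 0) + 1

        def dist(label):
            # Squared Euclidean distance from sample i to the class centroid.
            return sum((X[i][g] - sums[label][k] / counts[label]) ** 2
                       for k, g in enumerate(genes))

        correct += min(sums, key=dist) == y[i]
    return correct / len(X)

def fitness(X, y, genes):
    # Assumption: accuracy is compared first; ties go to the smaller gene set.
    return (loo_accuracy(X, y, genes), -len(genes))

def mutate(genes, pool, rng):
    """Integer-coded mutation: swap one gene index for an unused one from the pool."""
    unused = [g for g in pool if g not in genes]
    child = list(genes)
    if unused:
        child[rng.randrange(len(child))] = rng.choice(unused)
    return child

def shrink(genes, rng):
    """Elimination step: drop one randomly chosen gene index."""
    child = list(genes)
    child.pop(rng.randrange(len(child)))
    return child

def ga_rfe(X, y, pool, start_size, pop_size=20, rounds=30, seed=0):
    rng = random.Random(seed)
    pop = [rng.sample(pool, start_size) for _ in range(pop_size)]
    for _ in range(rounds):
        children = []
        for c in pop:
            children.append(mutate(c, pool, rng))
            if len(c) > 1:
                children.append(shrink(c, rng))
        # Elitist selection: keep the fittest of parents plus children.
        pop = sorted(pop + children,
                     key=lambda c: fitness(X, y, c), reverse=True)[:pop_size]
    return sorted(pop[0])

# Toy data (made up): 8 samples, 6 genes; only gene 2 separates the two classes.
y = [0, 0, 0, 0, 1, 1, 1, 1]
X = []
for i, label in enumerate(y):
    row = [((i % 4) * (g + 1) % 5) / 5.0 for g in range(6)]  # class-independent noise
    row[2] = 10.0 * label                                    # informative gene
    X.append(row)

best = ga_rfe(X, y, pool=list(range(6)), start_size=3)
print(best, loo_accuracy(X, y, best))
```

The tuple-valued fitness is one simple way to encode the stated goal of "minimal size and maximal information"; a real pipeline would presumably use a stronger classifier and cross-validation scheme than this stand-in.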
Bhattacharyya, C; Grate, LR; Rizki, A; Radisky, D; Molina, FJ; Jordan, MI; Bissell, MJ; Mian, IS
Institution: Univ Calif Berkeley, Lawrence Berkeley Natl Lab, Div Life Sci, Berkeley, CA 94720 USA