FCM-SVM-RFE gene feature selection algorithm for leukemia classification from microarray gene expression data

Authors
Tang, YC [1]
Zhang, YQ [1]
Huang, Z [1]
Affiliations
[1] Georgia State Univ, Dept Comp Sci, Atlanta, GA 30302 USA
Keywords
microarray gene expression data analysis; gene selection; support vector machines; recursive feature elimination; fuzzy C-means clustering; ACUTE MYELOID-LEUKEMIA; CANCER; TUMOR
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Selecting the genes most likely to be cancer-related from very large microarray gene expression datasets is an important bioinformatics research topic, because it can improve our understanding of the underlying mechanisms that cause cancer. This is essentially a feature selection problem, and the huge number of genes makes an exhaustive search infeasible. In this work, we propose a Recursive Feature Elimination (RFE) algorithm named FCM-SVM-RFE for the gene selection task. In each step, similar genes are grouped into clusters by the Fuzzy C-Means (FCM) clustering algorithm, and a Support Vector Machine (SVM) is then modeled in each cluster-induced space. The genes that contribute most to the margin width of the SVM are selected to survive to the next step. This process is repeated until a pre-specified number of genes remains. FCM-SVM-RFE is compared with SVM-RFE on AML/ALL microarray gene expression data. The experimental results show that FCM-SVM-RFE is more accurate than SVM-RFE at predicting unknown samples. More importantly, FCM-SVM-RFE can find compact subsets of genes, on each of which an SVM with perfect prediction accuracy can be modeled. These "most informative genes" can help biologists find the underlying cancer-causing mechanism efficiently and effectively.
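The loop described in the abstract (cluster the surviving genes, fit an SVM per cluster, keep the highest-scoring genes, repeat) can be sketched roughly as follows. This is only an illustrative reading of the abstract, not the authors' implementation: the function names `fuzzy_c_means` and `fcm_svm_rfe`, the hard assignment of each gene to its highest-membership cluster, the squared linear-SVM weight used as the ranking score (the usual SVM-RFE criterion), the `keep_frac` parameter, and the final global-ranking fallback are all assumptions introduced here. It assumes a binary label vector, as in the AML/ALL task, and uses NumPy and scikit-learn.

```python
import numpy as np
from sklearn.svm import SVC


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=50, seed=0):
    """Plain fuzzy C-means; returns the (n_points, n_clusters) membership matrix."""
    rng = np.random.default_rng(seed)
    u = rng.random((points.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)          # memberships of each point sum to 1
    for _ in range(n_iter):
        w = u ** m
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)         # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return u


def fcm_svm_rfe(X, y, n_keep, n_clusters=3, keep_frac=0.5, seed=0):
    """X: (n_samples, n_genes), y: binary labels. Returns surviving gene indices."""
    alive = np.arange(X.shape[1])
    while alive.size > n_keep:
        k = min(n_clusters, alive.size)
        # Each gene is a point described by its expression profile across samples.
        u = fuzzy_c_means(X[:, alive].T, k, seed=seed)
        labels = u.argmax(axis=1)              # assumption: hard assignment
        survivors = []
        for c in range(k):
            genes = alive[labels == c]
            if genes.size == 0:
                continue
            # Linear SVM in the space spanned by this cluster's genes.
            svm = SVC(kernel="linear", C=1.0).fit(X[:, genes], y)
            score = svm.coef_[0] ** 2          # assumption: SVM-RFE style w_i^2 score
            n_c = max(1, int(np.ceil(genes.size * keep_frac)))
            survivors.append(genes[np.argsort(score)[::-1][:n_c]])
        candidate = np.sort(np.concatenate(survivors))
        if candidate.size < n_keep or candidate.size == alive.size:
            # Assumed fallback: finish with a single global SVM ranking so that
            # exactly n_keep genes are returned and the loop always terminates.
            score = SVC(kernel="linear", C=1.0).fit(X[:, alive], y).coef_[0] ** 2
            return np.sort(alive[np.argsort(score)[::-1][:n_keep]])
        alive = candidate
    return alive


if __name__ == "__main__":
    # Synthetic stand-in for expression data: 40 samples, 30 genes,
    # with the first 3 genes shifted in one class.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 30))
    y = np.repeat([0, 1], 20)
    X[y == 1, :3] += 2.0
    print(fcm_svm_rfe(X, y, n_keep=5))
```

Clustering before ranking means that genes with similar expression profiles compete with one another inside their own cluster rather than across the whole gene set. The choices of `n_clusters` and `keep_frac` here are arbitrary placeholders and would need tuning on real data.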
Pages: 97-101
Page count: 5