An efficient statistical feature selection approach for classification of gene expression data

Cited by: 110
Authors
Chandra, B. [1]
Gupta, Manish [1]
Affiliations
[1] Indian Inst Technol Delhi, Dept Math, New Delhi 110016, India
Keywords
Cancer diagnosis and prediction; Gene selection; Classification; Feature selection; CANCER CLASSIFICATION; T-TEST; PREDICTION; TUMOR;
DOI
10.1016/j.jbi.2011.01.001
Chinese Library Classification
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Classification of gene expression data plays a significant role in the prediction and diagnosis of diseases. Gene expression data has a special characteristic: the gene dimension is mismatched with the sample dimension. Not all genes contribute to efficient classification of samples. A robust feature selection algorithm is required to identify the important genes that help in classifying the samples efficiently. Many feature selection algorithms have been introduced in the past to select informative genes (features) based on relevance and redundancy characteristics. Most of the earlier algorithms require a computationally expensive search strategy to find an optimal feature subset. Existing feature selection methods are also sensitive to the evaluation measures. The paper introduces a novel and efficient feature selection approach, termed ERGS (Effective Range based Gene Selection), based on a statistically defined effective range of each feature for every class. The basic principle behind ERGS is that a higher weight is given to a feature that discriminates the classes clearly. Experimental results on well-known gene expression datasets illustrate the effectiveness of the proposed approach. Two popular classifiers, the Naive Bayes Classifier (NBC) and the Support Vector Machine (SVM), have been used for classification. The proposed feature selection algorithm can help in ranking the genes and is also capable of identifying the most relevant genes responsible for diseases such as leukemia, colon tumor, lung cancer, diffuse large B-cell lymphoma (DLBCL), and prostate cancer. (C) 2011 Elsevier Inc. All rights reserved.
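The abstract states the ERGS principle but not its formulas, so the following is only a minimal sketch of that principle under stated assumptions. It assumes that a feature's effective range in a class is the class mean plus or minus (1 - class prior) * gamma * standard deviation, that overlap between class ranges is summed pairwise, and that the weight is one minus the normalized overlap. The function names, the gamma default of 1.732, and the toy data are all hypothetical and chosen for illustration.

```python
from statistics import mean, pstdev


def effective_range_weights(X, y, gamma=1.732):
    """Weight each feature by how little its per-class ranges overlap.

    Assumption (not stated in the abstract): the effective range of a
    feature in a class is mean +/- (1 - class_prior) * gamma * std.
    """
    classes = sorted(set(y))
    n_samples = len(y)
    coeffs = []
    for i in range(len(X[0])):
        ranges = []
        for c in classes:
            values = [row[i] for row, label in zip(X, y) if label == c]
            prior = len(values) / n_samples
            half_width = (1 - prior) * gamma * pstdev(values)
            centre = mean(values)
            ranges.append((centre - half_width, centre + half_width))
        # Order classes by range centre so overlaps are measured left to right.
        ranges.sort(key=lambda r: (r[0] + r[1]) / 2)
        overlap = sum(
            max(0.0, ranges[j][1] - ranges[k][0])
            for j in range(len(ranges))
            for k in range(j + 1, len(ranges))
        )
        span = max(hi for _, hi in ranges) - min(lo for lo, _ in ranges)
        # A zero span means the feature is constant: score it as uninformative.
        coeffs.append(overlap / span if span > 0 else 1.0)
    worst = max(coeffs)
    if worst == 0:
        return [1.0] * len(coeffs)  # no feature overlaps at all
    # Less overlap relative to the worst feature gives a higher weight.
    return [1.0 - c / worst for c in coeffs]


def rank_features(weights):
    """Return feature indices ordered from highest to lowest weight."""
    return sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)


if __name__ == "__main__":
    # Toy data: feature 0 separates the two classes, feature 1 does not.
    X = [[0.0, 5.0], [0.1, 1.0], [0.2, 3.0],
         [10.0, 4.9], [10.1, 1.1], [10.2, 3.1]]
    y = [0, 0, 0, 1, 1, 1]
    weights = effective_range_weights(X, y)
    print(weights, rank_features(weights))
```

Each feature is scored independently in a single pass, so this kind of ranking needs no subset search; the top-ranked features could then be passed to a classifier such as Naive Bayes or an SVM.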
Pages: 529-535
Page count: 7