A New Filter Approach Based on Effective Ranges for Classification of Gene Expression Data

Times cited: 0
Authors
Turfan, Derya [1]
Altunkaynak, Bulent [2]
Yeniay, Ozgur [1]
Affiliations
[1] Hacettepe Univ, Dept Stat, TR-06800 Ankara, Turkiye
[2] Gazi Univ, Dept Stat, Ankara, Turkiye
Keywords
classification methods; effective range; feature selection; filter methods; gene expression data; FEATURE-SELECTION; CANCER CLASSIFICATION; SVM-RFE; INFORMATION; PREDICTION; ALGORITHM; PROFILE
DOI
10.1089/big.2022.0086
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Over the years, many studies have sought to reduce or eliminate the effects of disease on human health. Gene expression data sets play a critical role in diagnosing and treating diseases. These data sets contain thousands of genes but only a small number of samples. This imbalance gives rise to the curse of dimensionality and makes such data sets difficult to analyze. One of the most effective strategies for addressing this problem is feature selection, a preprocessing step that improves classification accuracy by retaining only the most relevant and informative features. In this article, we propose a new statistically based filter method for feature selection, named Effective Range-based Feature Selection Algorithm (FSAER). As an extension of the earlier Effective Range based Gene Selection (ERGS) and Improved Feature Selection based on Effective Range (IFSER) algorithms, the proposed method combines the advantages of both while also taking the disjoint area into account. To illustrate its efficacy, experiments were conducted on six benchmark gene expression data sets. FSAER was compared with other filter methods in terms of classification accuracy, using support vector machines, the naive Bayes classifier, and the k-nearest neighbor algorithm as classifiers.
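The abstract does not spell out how FSAER is computed, so the following is only a rough sketch of the general effective-range idea behind ERGS-style filters, not the authors' algorithm. It assumes that each class gets an interval around its per-feature mean, scaled by the class standard deviation, one minus the class prior, and a constant `gamma` (1.732 here, an assumed value), and that a feature is weighted higher when these class intervals overlap less. The function name `effective_range_weights` and the pairwise overlap sum are illustrative choices; the disjoint-area term mentioned in the abstract is not modeled.

```python
import numpy as np

GAMMA = 1.732  # assumed scaling constant for the effective range


def effective_range_weights(X, y, gamma=GAMMA):
    """Score features by how little their per-class effective ranges overlap.

    Simplified illustration only: returns one weight in [0, 1] per feature,
    where a higher weight means better class separation on that feature.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n_features = X.shape[1]

    lower = np.empty((len(classes), n_features))
    upper = np.empty((len(classes), n_features))
    for k, c in enumerate(classes):
        Xc = X[y == c]
        prior = Xc.shape[0] / X.shape[0]
        mu = Xc.mean(axis=0)
        sigma = Xc.std(axis=0)
        half_width = (1.0 - prior) * gamma * sigma
        lower[k] = mu - half_width
        upper[k] = mu + half_width

    # Sum the overlap width of every pair of class ranges, per feature.
    overlap = np.zeros(n_features)
    for a in range(len(classes)):
        for b in range(a + 1, len(classes)):
            width = np.minimum(upper[a], upper[b]) - np.maximum(lower[a], lower[b])
            overlap += np.clip(width, 0.0, None)

    # Normalize by the total span covered by all class ranges.
    # A zero span (feature constant in every class) is treated as full overlap.
    span = upper.max(axis=0) - lower.min(axis=0)
    area_coeff = np.divide(overlap, span, out=np.ones_like(overlap), where=span > 0)

    max_coeff = area_coeff.max()
    if max_coeff > 0:
        area_coeff = area_coeff / max_coeff
    return 1.0 - area_coeff
```

In a filter setting, one would typically rank genes by these weights and pass only the top-ranked ones to a classifier such as a support vector machine, naive Bayes, or k-nearest neighbor.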
Pages: 312-330
Page count: 19