Distributed feature selection (DFS) strategy for microarray gene expression data to improve the classification performance

Cited by: 24
Authors
Potharaju, Sai Prasad [1]
Sreedevi, M. [1]
Affiliations
[1] KL Univ, Dept CSE, Guntur, AP, India
Source
Keywords
Microarray; Feature selection; Classification; High dimensionality
DOI
10.1016/j.cegh.2018.04.001
Chinese Library Classification (CLC) number
R1 [Preventive medicine, hygiene]
Discipline classification code
1004; 120402
Abstract
Objective: This article presents a novel feature selection strategy for improving classification performance on high-dimensional datasets. The curse of dimensionality is the most serious drawback of microarray data, because such data contain a large number of genes (features), and this undermines computational stability. In microarray data analytics, identifying the most relevant features therefore requires close attention. Most researchers apply a two-stage strategy to gene expression data analysis. In the first stage, feature selection or feature extraction is used as a preprocessing step to pinpoint the most prominent features. In the second stage, classification is performed on the selected subset of features. Method: This study follows the same two-stage strategy, but introduces a distributed feature selection (DFS) strategy based on Symmetrical Uncertainty (SU) and a Multi-Layer Perceptron (MLP), in which the features are distributed across multiple clusters. Each cluster holds a finite number of features. An MLP is applied to each cluster, and the cluster with the highest accuracy and the lowest root mean square (RMS) error is nominated as the dominant cluster. Result: Classification accuracy with Ridor, Simple Cart (SC), KNN, and SVM is measured using the features of the dominant cluster. The performance of this cluster is compared with traditional filter-based ranking techniques: Information Gain (IG), Gain Ratio Attribute Evaluator (GRAE), and Chi-Squared Attribute Evaluator (Chi). The proposed method recorded approximately a 57% success rate and an 18% competitive rate against the traditional methods when applied to seven high-dimensional datasets and one lower-dimensional dataset. Conclusion: The proposed methodology was applied to very high-dimensional microarray datasets. Using this method, memory consumption is reduced and classification performance can be improved.
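The pipeline described in the abstract (rank features by SU, spread them over clusters, score each cluster, nominate the dominant one) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation. The abstract does not say how features are assigned to clusters, so the round-robin split over the SU ranking is an assumption. The function names and the (accuracy, RMS error) score pairs are hypothetical placeholders for results that would come from training an MLP on each cluster. Feature values are assumed to be already discretized.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def rank_features(columns, labels):
    """Return feature indices sorted by decreasing SU with the class labels."""
    scores = [symmetrical_uncertainty(col, labels) for col in columns]
    return sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)


def distribute(ranked, n_clusters):
    """ASSUMPTION: deal the ranked features round-robin across clusters."""
    return [ranked[i::n_clusters] for i in range(n_clusters)]


def pick_dominant(cluster_scores):
    """Pick the cluster with the highest accuracy; ties go to the lowest RMS error.

    cluster_scores is a list of (accuracy, rms_error) pairs, one per cluster,
    e.g. obtained by evaluating an MLP on each cluster (not done here).
    """
    return max(
        range(len(cluster_scores)),
        key=lambda i: (cluster_scores[i][0], -cluster_scores[i][1]),
    )


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    columns = [
        [0, 1, 0, 1],  # unrelated to the labels
        [0, 0, 1, 1],  # identical to the labels
        [0, 0, 0, 1],  # partially informative
    ]
    ranked = rank_features(columns, labels)
    clusters = distribute(ranked, n_clusters=2)
    print("ranking:", ranked)
    print("clusters:", clusters)
    # Hypothetical (accuracy, RMS error) pairs for the two clusters.
    print("dominant cluster:", pick_dominant([(0.90, 0.30), (0.95, 0.20)]))
```

In a full pipeline, the features of the dominant cluster would then be passed to the downstream classifiers named in the abstract (Ridor, Simple Cart, KNN, SVM).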
Pages: 171-176
Number of pages: 6