Distributed feature selection (DFS) strategy for microarray gene expression data to improve the classification performance

Cited by: 24
Authors
Potharaju, Sai Prasad [1]
Sreedevi, M. [1]
Affiliations
[1] KL Univ, Dept CSE, Guntur, AP, India
Source
Keywords
Microarray; Feature selection; Classification; High dimensionality;
DOI
10.1016/j.cegh.2018.04.001
Chinese Library Classification (CLC) number
R1 [Preventive medicine, hygiene];
Discipline classification code
1004 ; 120402 ;
Abstract
Objective: This article presents a novel feature selection strategy for improving classification performance on high-dimensional data sets. The curse of dimensionality is the most serious drawback of microarray data, which contain a large number of genes (features), and it undermines computational stability. In microarray data analytics, identifying the most relevant features requires careful attention. Most researchers apply a two-stage strategy to gene expression data analysis. In the first stage, feature selection or feature extraction is employed as a preprocessing step to pinpoint the most prominent features. In the second stage, classification is applied using the selected subset of features. Method: This research follows the same two-stage strategy but introduces a distributed feature selection (DFS) strategy that uses Symmetrical Uncertainty (SU) and a Multi-Layer Perceptron (MLP), distributing the features across multiple clusters. Each cluster holds a finite number of features. An MLP is applied to each cluster, and the cluster with the highest accuracy and lowest root mean square (RMS) error is nominated as the dominant cluster. Result: Classification accuracy with Ridor, Simple Cart (SC), KNN, and SVM is measured using the dominant cluster's features. The performance of this cluster is compared with traditional filter-based ranking techniques such as Information Gain (IG), Gain Ratio Attribute Evaluator (GRAE), and Chi-Squared Attribute Evaluator (Chi). The proposed method recorded approximately a 57% success rate and an 18% competitive rate against the traditional methods when applied to seven high-dimensional datasets and one lower-dimensional dataset. Conclusion: The proposed methodology was applied to very high-dimensional microarray datasets. With this method, memory consumption will be reduced and classification performance can be improved.
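The pipeline described in the abstract (rank features by Symmetrical Uncertainty, distribute them across clusters, then nominate a dominant cluster) can be illustrated with a minimal sketch. It is not the authors' implementation. Several parts are assumptions: features are taken to be already discretized, ranked features are dealt to clusters round-robin (the abstract does not say how they are distributed), ties in accuracy are broken by the lower RMS error, and the MLP evaluation is replaced by a hypothetical `evaluate` callback that returns an `(accuracy, rms_error)` pair. All function names and the toy data are illustrative.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def distribute(features, labels, n_clusters):
    """Rank features by SU with the class label, then deal them round-robin.

    `features` maps a feature name to its list of discretized values.
    Round-robin dealing is an assumption made for this sketch.
    """
    ranked = sorted(
        features,
        key=lambda name: symmetrical_uncertainty(features[name], labels),
        reverse=True,
    )
    return [ranked[i::n_clusters] for i in range(n_clusters)]


def dominant_cluster(clusters, evaluate):
    """Return the cluster with the highest accuracy, then the lowest RMS error.

    `evaluate(cluster)` is a hypothetical callback standing in for training
    an MLP on the cluster's features; it returns (accuracy, rms_error).
    """
    def key(cluster):
        accuracy, rms_error = evaluate(cluster)
        return (accuracy, -rms_error)

    return max(clusters, key=key)


# Toy data: four discretized "genes" measured on four samples.
labels = [0, 0, 1, 1]
features = {
    "g1": [0, 0, 1, 1],  # identical to the class label
    "g2": [0, 1, 0, 1],  # independent of the class label
    "g3": [0, 0, 0, 1],  # partially informative
    "g4": [1, 1, 1, 1],  # constant
}
clusters = distribute(features, labels, n_clusters=2)

# Made-up scores in place of real MLP results, for illustration only.
fake_scores = {0: (0.90, 0.20), 1: (0.75, 0.35)}
best = dominant_cluster(clusters, lambda c: fake_scores[clusters.index(c)])
```

In practice the `evaluate` callback would train and cross-validate a classifier on the columns named in each cluster; the made-up scores here only exercise the selection logic.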
Pages: 171-176
Number of pages: 6