Simultaneous feature selection and clustering of micro-array and RNA-sequence gene expression data using multiobjective optimization

Cited by: 6
Authors
Alok, Abhay Kumar [1]
Gupta, Pooja [2]
Saha, Sriparna [1]
Sharma, Vineet [2]
Affiliations
[1] Indian Inst Technol, Comp Sci Engn, Patna, Bihar, India
[2] AKTU, Comp Sci Engn, Krishna Inst Engn & Technol, Lucknow, Uttar Pradesh, India
Keywords
Gene expression data clustering; Feature selection; Point symmetry based distance; Multiobjective optimization; Cluster validity index; ALGORITHM; DISTANCE
DOI
10.1007/s13042-020-01139-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we devise a multiobjective optimization framework for clustering gene expression data in a reduced feature space. The clustering problem is viewed from two aspects: clustering of genes in a reduced sample space, and clustering of samples in a reduced gene space. Three objective functions (two internal cluster validity indices and the number of selected features) are optimized simultaneously by a popular multiobjective simulated annealing based approach, AMOSA. A point symmetry based distance is used to assign gene data points to clusters. Seven publicly available benchmark gene expression data sets are used for the experiments, and both aspects of clustering in reduced feature space are demonstrated. The proposed gene expression clustering technique outperforms nine existing clustering techniques. In addition, statistical and biological significance tests have been carried out to show that the proposed FSC-MOO technique yields more statistically and biologically enriched results.
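To make the idea of optimizing three objectives at once concrete, the following is a minimal sketch of Pareto dominance, the comparison that multiobjective methods such as AMOSA rely on when deciding which candidate solutions to keep. It is not the authors' implementation. The function names, the example values, and the assumption that all three objectives (two validity-index scores and the feature count) are expressed as quantities to be minimized are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical objective vector: (validity_index_1, validity_index_2, n_features).
# Assumption: every objective is written so that smaller is better.
Objectives = Tuple[float, float, int]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(candidates: List[Objectives]) -> List[Objectives]:
    """Keep only the candidates that no other candidate dominates (the Pareto front)."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Made-up objective values for four candidate clusterings.
candidates: List[Objectives] = [
    (0.40, 0.30, 12),
    (0.35, 0.30, 20),
    (0.50, 0.45, 12),
    (0.40, 0.30, 15),
]
front = non_dominated(candidates)
```

In this toy input, the third and fourth candidates are dominated by the first, while the first and second trade a better validity score against a smaller feature count, so neither dominates the other.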
Pages: 2541-2563
Number of pages: 23
Related papers
50 in total
  • [41] Genetic Clustering Algorithm-Based Feature Selection and Divergent Random Forest for Multiclass Cancer Classification Using Gene Expression Data
    Senbagamalar, L.
    Logeswari, S.
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE SYSTEMS, 2024, 17 (01)
  • [42] Impact of Feature Selection on Support Vector Machine Using Microarray Gene Expression Data
    Wahid, Choudhury Muhammad Mufassil
    Ali, A. B. M. Shawkat
    Tickle, Kevin
    2009 SECOND INTERNATIONAL CONFERENCE ON MACHINE VISION, PROCEEDINGS (ICMV 2009), 2009 : 189 - 193
  • [43] Gene expression data classification using genetic algorithm-based feature selection
    Sonmez, Oznur Sinem
    Dagtekin, Mustafa
    Ensari, Tolga
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2021, 29 (07) : 3165 - 3179
  • [44] Feature Selection Using Approximate Conditional Entropy Based on Fuzzy Information Granule for Gene Expression Data Classification
    Zhang, Hengyi
    FRONTIERS IN GENETICS, 2021, 12
  • [45] Gene Expression Dataset Classification Using Artificial Neural Network and Clustering-Based Feature Selection
    Mabu, Audu Musa
    Prasad, Rajesh
    Yadav, Raghav
    INTERNATIONAL JOURNAL OF SWARM INTELLIGENCE RESEARCH, 2020, 11 (01) : 65 - 86
  • [46] Joint classifier and feature optimization for comprehensive cancer diagnosis using gene expression data
    Krishnapuram, B
    Carin, L
    Hartemink, AJ
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2004, 11 (2-3) : 227 - 242
  • [47] Simultaneous genes and training samples selection by modified particle swarm optimization for gene expression data classification
    Shen, Qi
    Mei, Zhen
    Ye, Bao-Xian
    COMPUTERS IN BIOLOGY AND MEDICINE, 2009, 39 (07) : 646 - 649
  • [48] Feature selection for classification of microarray gene expression cancers using Bacterial Colony Optimization with multi-dimensional population
    Wang, Hong
    Tan, Lijing
    Niu, Ben
    SWARM AND EVOLUTIONARY COMPUTATION, 2019, 48 : 172 - 181
  • [49] Feature selection of gene expression data for Cancer classification using double RBF-kernels
    Liu, Shenghui
    Xu, Chunrui
    Zhang, Yusen
    Liu, Jiaguo
    Yu, Bin
    Liu, Xiaoping
    Dehmer, Matthias
    BMC BIOINFORMATICS, 2018, 19
  • [50] Feature Selection in Gene Expression Data Using Principal Component Analysis and Rough Set Theory
    Mishra, Debahuti
    Dash, Rajashree
    Rath, Amiya Kumar
    Acharya, Milu
    SOFTWARE TOOLS AND ALGORITHMS FOR BIOLOGICAL SYSTEMS, 2011, 696 : 91 - 100