Simultaneous feature selection and clustering of micro-array and RNA-sequence gene expression data using multiobjective optimization

Cited by: 6
Authors
Alok, Abhay Kumar [1 ]
Gupta, Pooja [2 ]
Saha, Sriparna [1 ]
Sharma, Vineet [2 ]
Affiliations
[1] Indian Inst Technol, Comp Sci Engn, Patna, Bihar, India
[2] AKTU, Comp Sci Engn, Krishna Inst Engn & Technol, Lucknow, Uttar Pradesh, India
Keywords
Gene expression data clustering; Feature selection; Point symmetry based distance; Multiobjective optimization; Cluster validity index; ALGORITHM; DISTANCE
DOI
10.1007/s13042-020-01139-x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we devise a multiobjective optimization framework for clustering gene expression data in a reduced feature space. The clustering problem is viewed from two aspects: clustering genes in a reduced sample space, and clustering samples in a reduced gene space. Three objective functions (two internal cluster validity indices and the number of selected features) are optimized simultaneously by AMOSA, a popular multiobjective simulated annealing based approach. A point symmetry based distance is used to assign gene data points to clusters. Seven publicly available benchmark gene expression data sets are used in the experiments, and both aspects of clustering in a reduced feature space are demonstrated. The proposed gene expression clustering technique outperforms nine existing clustering techniques. In addition, statistical and biological significance tests have been carried out to show that the results of the proposed FSC-MOO technique are statistically significant and biologically enriched.
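The abstract describes three objectives being optimized simultaneously, which in multiobjective optimization generally means comparing candidate solutions by Pareto dominance rather than by a single score. The following is a minimal sketch of that comparison, not the authors' AMOSA implementation. It assumes all three objectives are expressed so that smaller is better (an index that is maximized would be negated first), and the function names and candidate values are hypothetical illustrations, not results from the paper.

```python
from typing import List, Sequence, Tuple

Objectives = Tuple[float, float, float]  # (validity index 1, validity index 2, feature count)


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b, assuming every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


# Hypothetical candidate solutions: made-up numbers for illustration only.
candidates: List[Objectives] = [
    (0.40, 1.2, 30),
    (0.35, 1.5, 25),
    (0.50, 1.3, 40),  # worse than the first candidate on every objective
    (0.40, 1.2, 35),  # same index values as the first, but more features
]

front = pareto_front(candidates)
print(front)
```

The first two candidates remain because each is better than the other on at least one objective; a multiobjective search keeps such trade-off solutions together instead of collapsing them into a single winner.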
Pages: 2541-2563
Number of pages: 23
Related papers
50 in total
  • [31] A comprehensive learning based swarm optimization approach for feature selection in gene expression data
    Easwaran, Subha
    Venugopal, Jothi Prakash
    Subramanian, Arul Antran Vijay
    Sundaram, Gopikrishnan
    Naseeba, Beebi
    HELIYON, 2024, 10 (17)
  • [32] Multivariate feature selection using random subspace classifiers for gene expression data
    Kamath, Vidya P.
    Hall, Lawrence O.
    Yeatman, Timothy J.
    Eschrich, Steven A.
    PROCEEDINGS OF THE 7TH IEEE INTERNATIONAL SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, VOLS I AND II, 2007, : 1041 - +
  • [33] Gene Expression Data Analysis Using Feature Weighted Robust Fuzzy c-Means Clustering
    Singh, Vikas
    Verma, Nishchal K. K.
    IEEE TRANSACTIONS ON NANOBIOSCIENCE, 2023, 22 (01) : 99 - 105
  • [34] Feature Selection in Microarray Gene Expression Data Using Fisher Discriminant Ratio
    Sarbazi-Azad, Saeed
    Abadeh, Mohammad Saniee
    Abadi, Mehdi Irannejad Najaf
    2018 8TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2018, : 225 - 230
  • [35] Stable Feature Selection for Gene Expression using Enhanced Binary Particle Swarm Optimization
    Dhrif, Hassen
    Wuchty, Stefan
    ICAART: PROCEEDINGS OF THE 12TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2020, : 437 - 444
  • [36] Feature selection using neighborhood entropy-based uncertainty measures for gene expression data classification
    Sun, Lin
    Zhang, Xiaoyu
    Qian, Yuhua
    Xu, Jiucheng
    Zhang, Shiguang
    INFORMATION SCIENCES, 2019, 502 : 18 - 41
  • [37] Unsupervised feature selection algorithm for multiclass cancer classification of gene expression RNA-Seq data
    Garcia-Diaz, Pilar
    Sanchez-Berriel, Isabel
    Martinez-Rojas, Juan A.
    Diez-Pascual, Ana M.
    GENOMICS, 2020, 112 (02) : 1916 - 1925
  • [38] A feature selection strategy using Markov clustering, for the optimization of brain tumor segmentation from MRI data
    Pisak-Lukats, Ioan-Marius
    Kovacs, Levente
    Szilagyi, Laszlo
    ACTA UNIVERSITATIS SAPIENTIAE INFORMATICA, 2022, 14 (02) : 316 - 337
  • [39] Robust simultaneous positive data clustering and unsupervised feature selection using generalized inverted Dirichlet mixture models
    Al Mashrgy, Mohamed
    Bdiri, Taoufik
    Bouguila, Nizar
    KNOWLEDGE-BASED SYSTEMS, 2014, 59 : 182 - 195
  • [40] Genetic Clustering Algorithm-Based Feature Selection and Divergent Random Forest for Multiclass Cancer Classification Using Gene Expression Data
    L. Senbagamalar
    S. Logeswari
    International Journal of Computational Intelligence Systems, 17