Advancing Gene Expression Data Analysis: an Innovative Multi-objective Optimization Algorithm for Simultaneous Feature Selection and Clustering

Cited by: 2
Authors
Gupta, Pooja [1]
Alok, Abhay Kumar [2]
Sharma, Vineet [3]
Affiliations
[1] Dr APJ Abdul Kalam Techn Univ, Lucknow, Uttar Pradesh, India
[2] Indian Inst Technol, Patna, India
[3] KIET Grp Inst, Ghaziabad, Delhi, India
Keywords
Gene expression data clustering; Feature selection; Point symmetry-based distance; AMOSA; Cluster validity index; Feature weight index; ENSEMBLE; MODEL
DOI
10.1590/1678-4324-2024230508
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Clustering algorithms play a crucial role in identifying co-expressed genes in microarray data, while feature subset identification is equally important when dealing with large data matrices. In this paper, we address the problem of simultaneous feature selection and gene expression data clustering within a multi-objective optimization framework. Our approach employs the archived multi-objective simulated annealing (AMOSA) algorithm to optimize a multi-objective function that incorporates two internal validity indices and a feature weight index. To determine the membership of data points in different clusters, we use a point symmetry-based distance metric. We demonstrate the effectiveness of the proposed approach on three publicly available gene expression datasets using the Silhouette index. Furthermore, we compare the clustering results of our approach, unsupervised feature selection and clustering using a multi-objective optimization framework (UFSC-MOO), with those of nine existing techniques and show that it performs better. Statistical significance is confirmed with the Wilcoxon rank-sum test. A biological significance test is also employed to show that the obtained clustering solutions are biologically enriched.
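The abstract evaluates clusterings with the Silhouette index while a subset of features is selected. The sketch below is one way to illustrate that idea, not the authors' implementation: it uses the standard Silhouette definition with Euclidean distance (the paper assigns points with a point symmetry-based distance, which is not reproduced here), and the function name `silhouette_index`, the boolean `feature_mask`, and the toy expression matrix are all hypothetical.

```python
from math import dist


def silhouette_index(points, labels, feature_mask):
    """Mean silhouette width, computed only on the selected features.

    Assumption: Euclidean distance and the standard silhouette definition;
    feature_mask is a hypothetical boolean list, one entry per feature.
    """
    # Project every point onto the selected feature subset.
    projected = [[v for v, keep in zip(p, feature_mask) if keep] for p in points]

    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette index needs at least two clusters")

    total = 0.0
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # A singleton cluster contributes a silhouette width of 0.
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(projected[i], projected[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(projected[i], projected[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        total += (b - a) / denom if denom > 0 else 0.0
    return total / len(points)


# Toy data (made up): two informative features and one noisy feature.
points = [
    [1.0, 1.1, 9.0],
    [1.2, 0.9, 0.5],
    [0.9, 1.0, 5.0],
    [5.0, 5.2, 0.7],
    [5.1, 4.9, 8.5],
    [4.8, 5.0, 4.0],
]
labels = [0, 0, 0, 1, 1, 1]

score_all = silhouette_index(points, labels, [True, True, True])
score_sel = silhouette_index(points, labels, [True, True, False])
print(f"all features: {score_all:.3f}, selected features: {score_sel:.3f}")
```

On this toy matrix, dropping the noisy third feature raises the score, which is the kind of interaction between feature subset and cluster quality that a joint optimization has to trade off.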
Pages: 20