Simultaneous feature selection and symmetry based clustering using multiobjective framework

Cited by: 19
Authors
Saha, Sriparna [1]
Spandana, Rachamadugu [1]
Ekbal, Asif [1]
Bandyopadhyay, Sanghamitra [2]
Affiliations
[1] Indian Inst Technol Patna, Dept Comp Sci & Engn, Patna, Bihar, India
[2] Indian Stat Inst Kolkata, Machine Intelligence Unit, Kolkata, India
Keywords
Clustering; Multiobjective optimization (MOO); Symmetry; Automatic feature selection; Automatic determination of number of clusters; OPTIMIZATION;
DOI
10.1016/j.asoc.2014.12.009
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes FeaClusMOO, a new framework based on multiobjective optimization (MOO) that is capable of identifying both the correct partitioning and the most relevant set of features of a data set. Archived multiobjective simulated annealing (AMOSA), a recently developed multiobjective simulated annealing technique, is used as the underlying optimization strategy. Features and cluster centers are encoded together in the form of a string. Three objective functions are used: two internal cluster validity indices, which measure the goodness of the obtained partitioning using Euclidean distance and point symmetry based distance, respectively, and a count of the number of features. These three objectives are optimized simultaneously using AMOSA in order to detect the appropriate subset of features, the appropriate number of clusters, and the appropriate partitioning. Points are allocated to clusters using a point symmetry based distance. Mutation changes both the feature combination and the set of cluster centers. Since AMOSA, like any other MOO technique, provides a set of solutions on the final Pareto front, a technique based on the concept of semi-supervised classification is developed to select a single solution from this set. The effectiveness of FeaClusMOO is shown on seven higher dimensional real-life data sets in comparison with three other clustering techniques: its Euclidean distance based version, in which Euclidean distance is used for cluster assignment; a genetic algorithm based automatic clustering technique (VGAPS-clustering) that uses point symmetry based distance with all the features; and K-means clustering with all features. (C) 2015 Published by Elsevier B.V.
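The abstract notes that a multiobjective optimizer returns a set of solutions on the Pareto front rather than a single answer. The following is a minimal sketch of what "non-dominated" means for three objectives. It is not the AMOSA algorithm and not the authors' implementation; it assumes all three objectives are cast as minimization, and the names `dominates`, `pareto_front`, and the candidate values are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical objective vector: (validity index 1, validity index 2, feature count).
# Assumption: every objective is expressed so that smaller is better.
Objectives = Tuple[float, float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates (input order preserved)."""
    return [
        p for p in points
        if not any(dominates(q, p) for q in points if q is not p)
    ]


# Made-up objective values for four candidate solutions.
candidates: List[Objectives] = [
    (0.30, 0.40, 5.0),
    (0.25, 0.45, 4.0),
    (0.35, 0.50, 6.0),  # dominated by the first candidate
    (0.30, 0.40, 7.0),  # same index values as the first, but more features
]

front = pareto_front(candidates)
print(front)
```

In this toy input the first two candidates trade off against each other, so both remain on the front. Choosing one of them is the step for which the paper develops its semi-supervised selection technique.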
Pages: 479 - 486
Page count: 8
Related papers
50 in total
  • [1] Simultaneous Feature Selection and Unsupervised Clustering for Gene-Expression Data in Multiobjective Optimization Framework
    Alok, Abhay Kumar
    Kanekar, Neha
    Saha, Sriparna
    Ekbal, Asif
    2014 9TH INTERNATIONAL CONFERENCE ON INDUSTRIAL AND INFORMATION SYSTEMS (ICIIS), 2014, : 691 - 696
  • [2] Simultaneous feature selection and clustering using mixture models
    Law, MHC
    Figueiredo, MAT
    Jain, AK
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2004, 26 (09) : 1154 - 1166
  • [3] Feature Selection and Semi-supervised Clustering Using Multiobjective Optimization
    Alok, Abhay Kumar
    Saha, Sriparna
    Ekbal, Asif
    2014 INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE ISCMI 2014, 2014, : 126 - 129
  • [4] Feature selection and semi-supervised clustering using multiobjective optimization
    Saha, Sriparna
    Ekbal, Asif
    Alok, Abhay Kumar
    Spandana, Rachamadugu
    SPRINGERPLUS, 2014, 3
  • [5] A Multiobjective Simultaneous Learning Framework for Clustering and Classification
    Cai, Weiling
    Chen, Songcan
    Zhang, Daoqiang
    IEEE TRANSACTIONS ON NEURAL NETWORKS, 2010, 21 (02) : 185 - 200
  • [6] HMOSHSSA: a novel framework for solving simultaneous clustering and feature selection problems
    Kumar, Vijay
    Kumari, Rajani
    Kumar, Sandeep
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (35) : 82149 - 82175
  • [7] A Framework for Feature Selection in Clustering
    Witten, Daniela M.
    Tibshirani, Robert
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2010, 105 (490) : 713 - 726
  • [8] A Feature Selection Framework Based on Supervised Data Clustering
    Liu, Hongzhi
    Fu, Bin
    Jiang, Zhengshen
    Wu, Zhonghai
    Hsu, D. Frank
    2016 IEEE 15TH INTERNATIONAL CONFERENCE ON COGNITIVE INFORMATICS & COGNITIVE COMPUTING (ICCI*CC), 2016, : 316 - 321
  • [9] Gene expression data clustering using a multiobjective symmetry based clustering technique
    Saha, Sriparna
    Ekbal, Asif
    Gupta, Kshitija
    Bandyopadhyay, Sanghamitra
    COMPUTERS IN BIOLOGY AND MEDICINE, 2013, 43 (11) : 1965 - 1977
  • [10] An evolutionary parallel multiobjective feature selection framework
    Kiziloz, Hakan Ezgi
    Deniz, Ayca
    COMPUTERS & INDUSTRIAL ENGINEERING, 2021, 159 (159)