Improving k-means Clustering with Genetic Programming for Feature Construction

Cited by: 3
Authors
Lensen, Andrew [1]
Xue, Bing [1]
Zhang, Mengjie [1]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Source
PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION (GECCO'17 COMPANION) | 2017
Keywords
Cluster Analysis; Feature Construction; Genetic Programming; k-means; Evolutionary Computation;
DOI
10.1145/3067695.3075962
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
k-means is one of the most commonly used clustering algorithms in data mining. Despite this, it has a number of fundamental limitations which prevent it from performing effectively on large or otherwise difficult datasets. A common technique to improve performance of data mining algorithms is feature construction, a technique which combines features together to produce more powerful constructed features that can improve the performance of a given algorithm. Genetic Programming (GP) has been used for feature construction very successfully, due to its program-like structure. This paper proposes two representations for using GP to perform feature construction to improve the performance of k-means, using a wrapper approach. Our results show significant improvements in performance compared to k-means using all original features across six difficult datasets.
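The wrapper approach mentioned in the abstract can be illustrated with a minimal sketch: candidate constructed features are produced by expression trees over the original features, k-means is run on the constructed features, and the resulting partition is scored to give the candidate's fitness. Everything concrete below is an assumption made for illustration and is not taken from the paper: the nested-tuple tree encoding, the function names (`evaluate`, `construct`, `kmeans`, `fitness`), the deterministic farthest-first initialisation, and the separation-over-scatter score. The abstract does not describe the paper's two representations or its fitness measure, and the GP search loop itself (selection, crossover, mutation) is omitted.

```python
import math

def evaluate(tree, row):
    """Evaluate a hypothetical expression tree (nested tuples) on one instance."""
    op = tree[0]
    if op == "feat":                      # terminal: an original feature, by index
        return row[tree[1]]
    left, right = evaluate(tree[1], row), evaluate(tree[2], row)
    if op == "add":
        return left + right
    if op == "sub":
        return left - right
    if op == "mul":
        return left * right
    raise ValueError(f"unknown operator: {op}")

def construct(trees, data):
    """Map every instance to one constructed feature per tree."""
    return [tuple(evaluate(t, row) for t in trees) for row in data]

def kmeans(points, k, iters=100):
    """Lloyd's k-means; farthest-first initialisation keeps the sketch deterministic."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:          # converged: assignments did not change
            break
        labels = new_labels
        # Update step: move each non-empty cluster's centroid to its mean.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def fitness(trees, data, k):
    """Wrapper fitness (assumed measure): cluster the constructed features,
    then return minimum centroid separation divided by mean within-cluster scatter."""
    points = construct(trees, data)
    labels = kmeans(points, k)
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:                 # degenerate partition: no separation at all
        return 0.0
    cents = {l: tuple(sum(col) / len(m) for col in zip(*m))
             for l, m in clusters.items()}
    within = sum(math.dist(p, cents[l]) for p, l in zip(points, labels)) / len(points)
    between = min(math.dist(a, b)
                  for i, a in cents.items() for j, b in cents.items() if i < j)
    return between / (within + 1e-12)

# Toy data: features 0 and 1 separate two groups, feature 2 is constant noise.
data = [(0.0, 0.1, 1.0), (0.2, 0.0, 1.0), (5.0, 5.1, 1.0), (5.2, 4.9, 1.0)]
useful = [("add", ("feat", 0), ("feat", 1))]   # combines the informative features
useless = [("feat", 2)]                        # uses only the constant feature
print(fitness(useful, data, 2), fitness(useless, data, 2))
```

In a full GP system, a population of such trees would be evolved and `fitness` called once per individual per generation, which is what makes the approach a wrapper: the clustering algorithm itself sits inside the evaluation loop.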
Pages: 237-238
Page count: 2