Improving k-means Clustering with Genetic Programming for Feature Construction

Cited by: 3
Authors
Lensen, Andrew [1]
Xue, Bing [1]
Zhang, Mengjie [1]
Affiliations
[1] Victoria Univ Wellington, Sch Engn & Comp Sci, POB 600, Wellington 6140, New Zealand
Source
PROCEEDINGS OF THE 2017 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION (GECCO'17 COMPANION) | 2017
Keywords
Cluster Analysis; Feature Construction; Genetic Programming; k-means; Evolutionary Computation;
DOI
10.1145/3067695.3075962
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
k-means is one of the most commonly used clustering algorithms in data mining. Despite this, it has a number of fundamental limitations which prevent it from performing effectively on large or otherwise difficult datasets. A common technique to improve performance of data mining algorithms is feature construction, a technique which combines features together to produce more powerful constructed features that can improve the performance of a given algorithm. Genetic Programming (GP) has been used for feature construction very successfully, due to its program-like structure. This paper proposes two representations for using GP to perform feature construction to improve the performance of k-means, using a wrapper approach. Our results show significant improvements in performance compared to k-means using all original features across six difficult datasets.
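The wrapper idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the paper's method: the function set `OPS`, the tuple-based tree encoding, the toy two-ring dataset, the deterministic k-means start, and the `scatter_ratio` fitness (within-cluster scatter divided by total scatter) are all hypothetical, and a fixed list of candidate trees stands in for the evolutionary search that GP would perform. Each candidate tree builds one constructed feature, k-means is run on that feature, and the clustering result is used to score the tree.

```python
import math

# Hypothetical function set for the expression trees (not taken from the paper).
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b, "mul": lambda a, b: a * b}

def evaluate(tree, row):
    """Evaluate a tree: a leaf is 'x<i>' (original feature i), a node is (op, left, right)."""
    if isinstance(tree, str):
        return row[int(tree[1:])]
    op, left, right = tree
    return OPS[op](evaluate(left, row), evaluate(right, row))

def dist2(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(points):
    """Component-wise mean of a non-empty list of tuples."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def kmeans(points, k, max_iter=100):
    """Plain Lloyd's k-means (k >= 2) with a deterministic start: evenly spaced sorted points."""
    ordered = sorted(points)
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[c] = mean(members)
    return labels, centroids

def scatter_ratio(points, labels, centroids):
    """Illustrative fitness: within-cluster scatter / total scatter (lower is better)."""
    center = mean(points)
    within = sum(dist2(p, centroids[lab]) for p, lab in zip(points, labels))
    total = sum(dist2(p, center) for p in points)
    return within / total if total > 0 else float("inf")

# Toy data: 12 points on a ring of radius 1, then 12 points on a ring of radius 3.
angles = [2 * math.pi * i / 12 for i in range(12)]
data = [(r * math.cos(t), r * math.sin(t)) for r in (1.0, 3.0) for t in angles]

# A fixed candidate set stands in for a GP population.
candidates = {
    "x0": "x0",
    "x0 + x1": ("add", "x0", "x1"),
    "x0*x0 + x1*x1": ("add", ("mul", "x0", "x0"), ("mul", "x1", "x1")),
}

def wrapper_fitness(tree):
    """Wrapper evaluation: build the feature, cluster on it, score the clustering."""
    constructed = [(evaluate(tree, row),) for row in data]
    labels, centroids = kmeans(constructed, 2)
    return scatter_ratio(constructed, labels, centroids), labels

results = {name: wrapper_fitness(tree) for name, tree in candidates.items()}
best_name = min(results, key=lambda name: results[name][0])
best_ratio, best_labels = results[best_name]

# Baseline: k-means on the two original features.
baseline_labels, baseline_centroids = kmeans(data, 2)
baseline_ratio = scatter_ratio(data, baseline_labels, baseline_centroids)

print("best constructed feature:", best_name)
print("scatter ratio, constructed vs. original:", best_ratio, baseline_ratio)
```

In this toy setting the squared-radius feature lets k-means separate the two rings, which the original coordinates do not allow. A score of this kind is only one possible choice of fitness, and each tree's ratio is measured in its own constructed feature space, so the comparison is indicative rather than rigorous.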
Pages: 237-238
Number of pages: 2
Related papers
50 in total
  • [21] A K-means Text Clustering Algorithm Based on Subject Feature Vector
    Duo, Ji
    Zhang, Peng
    Hao, Liu
    JOURNAL OF WEB ENGINEERING, 2021, 20 (06): : 1935 - 1946
  • [22] An Immune Genetic K-Means Algorithm for Mongolian Elements Clustering
    Hua, Chun
    Cheng, Chun Ying
    ADVANCES IN NEURAL NETWORKS - ISNN 2018, 2018, 10878 : 273 - 278
  • [23] PSO Aided k-Means Clustering: Introducing Connectivity in k-Means
    Breaban, Mihaela Elena
    Luchian, Henri
    GECCO-2011: PROCEEDINGS OF THE 13TH ANNUAL GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2011, : 1227 - 1234
  • [24] K-Means Extensions for Clustering Categorical Data
    Alwersh, Mohammed
    Kovacs, Laszlo
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (09) : 492 - 507
  • [25] Online K-Means Clustering with Lightweight Coresets
    Low, Jia Shun
    Ghafoori, Zahra
    Leckie, Christopher
    AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 191 - 202
  • [26] Improving Intrusion Detection Using PCA And K-Means Clustering Algorithm
    Khaoula, Radi
    Mohamed, Moughit
    2022 9TH INTERNATIONAL CONFERENCE ON WIRELESS NETWORKS AND MOBILE COMMUNICATIONS, WINCOM, 2022, : 19 - 23
  • [27] Soil data clustering by using K-means and fuzzy K-means algorithm
    Hot, Elma
    Popovic-Bugarin, Vesna
    2015 23RD TELECOMMUNICATIONS FORUM TELFOR (TELFOR), 2015, : 890 - 893
  • [28] Improving Bregman k-means
    Ashour, Wesam
    Fyfe, Colin
    INTERNATIONAL JOURNAL OF DATA MINING MODELLING AND MANAGEMENT, 2014, 6 (01) : 65 - 82
  • [29] K*-Means: An Effective and Efficient K-means Clustering Algorithm
    Qi, Jianpeng
    Yu, Yanwei
    Wang, Lihong
    Liu, Jinglei
    PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCES ON BIG DATA AND CLOUD COMPUTING (BDCLOUD 2016) SOCIAL COMPUTING AND NETWORKING (SOCIALCOM 2016) SUSTAINABLE COMPUTING AND COMMUNICATIONS (SUSTAINCOM 2016) (BDCLOUD-SOCIALCOM-SUSTAINCOM 2016), 2016, : 242 - 249
  • [30] Efficient Genetic K-Means Clustering for Health Care Knowledge Discovery
    Alsayat, Ahmed
    El-Sayed, Hoda
    2016 IEEE/ACIS 14TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING RESEARCH, MANAGEMENT AND APPLICATIONS (SERA), 2016, : 45 - 52