Robust trimmed k-means

Cited by: 6
Authors
Dorabiala, Olga [1]
Kutz, J. Nathan [1]
Aravkin, Aleksandr Y. [1]
Affiliations
[1] Univ Washington, Dept Appl Math, Seattle, WA 98195 USA
Keywords
k-Means; Clustering; Robust statistics; Trimming; Unsupervised learning; Outlier detection; Framework
DOI
10.1016/j.patrec.2022.07.007
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is a fundamental tool in unsupervised learning, used to group objects by distinguishing between similar and dissimilar features of a given data set. One of the most common clustering algorithms is k-means. Unfortunately, when dealing with real-world data many traditional clustering algorithms are compromised by lack of clear separation between groups, noisy observations, and/or outlying data points. Thus, robust statistical algorithms are required for successful data analytics. Current methods that robustify k-means clustering are specialized for either single- or multi-membership data, but do not perform competitively in both cases. We propose an extension of the k-means algorithm, which we call Robust Trimmed k-means (RTKM), that simultaneously identifies outliers and clusters points and can be applied to either single- or multi-membership data. We test RTKM on various real-world datasets and show that RTKM performs competitively with other methods on single-membership data with outliers and multi-membership data without outliers. We also show that RTKM leverages its relative advantages to outperform other methods on multi-membership data containing outliers. © 2022 Published by Elsevier B.V.
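
The abstract does not spell out how RTKM works, so the sketch below is not the authors' method. It illustrates only the general idea of trimming in k-means with the classical hard-trimming heuristic for single-membership data: at each iteration, the points farthest from their nearest center are flagged as outliers and excluded from the center update. The function name `trimmed_kmeans` and the parameters `n_outliers` and `init_centers` are assumptions made for this illustration.

```python
import numpy as np

def trimmed_kmeans(X, k, n_outliers, init_centers=None, n_iter=100, seed=0):
    """Classical hard-trimmed k-means heuristic (illustrative; NOT RTKM).

    Returns (centers, labels, outliers), where `outliers` is a boolean mask.
    All names and defaults here are hypothetical choices for this sketch.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if init_centers is None:
        # Assumption: initialize from k distinct data points chosen at random.
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.asarray(init_centers, dtype=float).copy()

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center: (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        nearest = d2[np.arange(n), labels]

        # Trim: flag the n_outliers points farthest from their nearest center.
        outliers = np.zeros(n, dtype=bool)
        if n_outliers > 0:
            outliers[np.argsort(nearest)[n - n_outliers:]] = True

        # Update each center using only its non-trimmed members.
        new_centers = centers.copy()
        for j in range(k):
            members = (labels == j) & ~outliers
            if members.any():
                new_centers[j] = X[members].mean(axis=0)

        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    return centers, labels, outliers


# Usage: two tight groups of four points plus two far-away points.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11],
              [100, 100], [-100, 50]])
centers, labels, outliers = trimmed_kmeans(
    X, k=2, n_outliers=2, init_centers=[[0, 0], [10, 10]]
)
print(centers)
print(outliers)
```

Like ordinary k-means, this heuristic depends on initialization: a center that starts on an outlier can stay there. The number of trimmed points must also be chosen in advance.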
Pages: 9-16
Page count: 8