Fast and robust general purpose clustering algorithms

Cited by: 32
Authors
Estivill-Castro, V [1]
Yang, J
Affiliations
[1] Griffith Univ, Sch Comp & Informat Technol, Nathan, Qld 4111, Australia
[2] Univ Western Sydney Macarthur, Sch Comp & Informat Technol, Campbelltown, NSW 2560, Australia
Keywords
clustering; k-MEANS; medoids; 1-median problem; combinatorial optimization; EXPECTATION MAXIMIZATION
DOI
10.1023/B:DAMI.0000015869.08323.b3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
General purpose and highly applicable clustering methods are usually required during the early stages of knowledge discovery exercises. k-MEANS has been adopted as the prototype of iterative model-based clustering because of its speed, simplicity and capability to work within the format of very large databases. However, k-MEANS has several disadvantages derived from its statistical simplicity. We propose an algorithm that remains very efficient, generally applicable, multidimensional but is more robust to noise and outliers. We achieve this by using medians rather than means as estimators for the centers of clusters. Comparison with k-MEANS, EXPECTATION MAXIMIZATION and GIBBS sampling demonstrates the advantages of our algorithm.
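The robustness argument in the abstract (medians rather than means as cluster-center estimators) can be illustrated with a minimal sketch. This is not the authors' algorithm. It assumes a discrete reading of the 1-median problem named in the keywords, where the center is the data point with the smallest total Euclidean distance to the others (a medoid). The function names `mean_center` and `medoid_center` and the sample data are hypothetical, chosen only for illustration.

```python
import math

def mean_center(points):
    """Coordinate-wise arithmetic mean, the center estimate used by k-MEANS."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dim))

def medoid_center(points):
    """Discrete 1-median (medoid): the data point whose summed Euclidean
    distance to all points is smallest. Brute force, O(n^2) distance
    evaluations; an illustrative assumption, not the paper's method."""
    def total_distance(candidate):
        return sum(math.dist(candidate, p) for p in points)
    return min(points, key=total_distance)

# A small, tight cluster (hypothetical data).
cluster = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0), (1.5, 1.5)]
# The same cluster with one distant outlier added.
noisy = cluster + [(100.0, 100.0)]

for name, data in (("clean", cluster), ("with outlier", noisy)):
    print(name, "mean:", mean_center(data), "medoid:", medoid_center(data))
```

On this sample the single outlier drags the mean far outside the cluster, while the medoid stays at the same data point. The brute-force search is quadratic in the number of points, so it only shows the estimator's behavior and says nothing about the efficiency the abstract claims.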
Pages: 127-150
Page count: 24
Related papers
50 in total
  • [31] Robust approach for textured image clustering
    Ennouni, Assia
    Sabri, My Abdelouahed
    Senhaji, Saloua
    Aarab, Abdellah
    2016 4TH IEEE INTERNATIONAL COLLOQUIUM ON INFORMATION SCIENCE AND TECHNOLOGY (CIST), 2016, : 465 - 470
  • [32] Robust clustering
    Banerjee, Amit
    Dave, Rajesh N.
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2012, 2 (01) : 29 - 59
  • [33] Tk-Merge: Computationally Efficient Robust Clustering Under General Assumptions
    Insolia, Luca
    Perrotta, Domenico
    BUILDING BRIDGES BETWEEN SOFT AND STATISTICAL METHODOLOGIES FOR DATA SCIENCE, 2023, 1433 : 216 - 223
  • [34] Robust Clustering Using Hyperdimensional Computing
    Ge, Lulu
    Parhi, Keshab K.
    IEEE OPEN JOURNAL OF CIRCUITS AND SYSTEMS, 2024, 5 : 102 - 116
  • [35] Improved and generalized learning strategies for dynamically fast and statistically robust evolutionary algorithms
    Dashora, Yogesh
    Kumar, Sanjeev
    Shukla, Nagesh
    Tiwari, M. K.
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2008, 21 (04) : 525 - 547
  • [36] A fast implementation of the ISODATA clustering algorithm
    Memarsadeghi, Nargess
    Mount, David M.
    Netanyahu, Nathan S.
    Le Moigne, Jacqueline
    INTERNATIONAL JOURNAL OF COMPUTATIONAL GEOMETRY & APPLICATIONS, 2007, 17 (01) : 71 - 103
  • [37] A Mobility-Aware General-Purpose Vehicular Ad-Hoc Network Clustering Scheme
    Song, Min
    Cuckov, Filip
    JOURNAL OF INFORMATION SCIENCE AND ENGINEERING, 2010, 26 (03) : 897 - 911
  • [38] Brief Announcement: Fast and Better Distributed MapReduce Algorithms for k-Center Clustering
    Im, Sungjin
    Moseley, Benjamin
    SPAA'15: PROCEEDINGS OF THE 27TH ACM SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES, 2015, : 65 - 67
  • [39] Fast Clustering with Flexible Balance Constraints
    Liu, Hongfu
    Huang, Ziming
    Chen, Qi
    Li, Mingqin
    Fu, Yun
    Zhang, Lintao
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 743 - 750
  • [40] Constrained Clustering Problems: New Optimization Algorithms
    Ibn-Khedher, Hatem
    Hadji, Makhlouf
    Ibn Khedher, Mohamed
    Khebbache, Selma
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING (ICAISC 2021), PT II, 2021, 12855 : 159 - 170