Fast and robust general purpose clustering algorithms

Cited by: 32
Authors
Estivill-Castro, V [1]
Yang, J
Affiliations
[1] Griffith Univ, Sch Comp & Informat Technol, Nathan, Qld 4111, Australia
[2] Univ Western Sydney Macarthur, Sch Comp & Informat Technol, Campbelltown, NSW 2560, Australia
Keywords
clustering; k-MEANS; medoids; 1-median problem; combinatorial optimization; EXPECTATION MAXIMIZATION
DOI
10.1023/B:DAMI.0000015869.08323.b3
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
General purpose and highly applicable clustering methods are usually required during the early stages of knowledge discovery exercises. k-MEANS has been adopted as the prototype of iterative model-based clustering because of its speed, simplicity and capability to work within the format of very large databases. However, k-MEANS has several disadvantages derived from its statistical simplicity. We propose an algorithm that remains very efficient, generally applicable, multidimensional but is more robust to noise and outliers. We achieve this by using medians rather than means as estimators for the centers of clusters. Comparison with k-MEANS, EXPECTATION MAXIMIZATION and GIBBS sampling demonstrates the advantages of our algorithm.
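The robustness argument in the abstract (medians rather than means as center estimators) can be illustrated with a minimal sketch. This is not the authors' algorithm: it uses a coordinate-wise median purely for illustration, and the helper `center` and the sample points are hypothetical.

```python
from statistics import mean, median

def center(points, estimator):
    """Apply a one-dimensional estimator to each coordinate of the points."""
    return tuple(estimator(coord) for coord in zip(*points))

# Hypothetical data: a tight cluster plus a single distant outlier.
cluster = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0)]
noisy = cluster + [(100.0, 100.0)]

mean_center = center(noisy, mean)      # dragged far toward the outlier
median_center = center(noisy, median)  # stays inside the original cluster

print("mean center:  ", mean_center)
print("median center:", median_center)
```

With one outlier among five points, the mean-based center leaves the cluster entirely, while the median-based center remains within it, which is the kind of sensitivity to noise the abstract attributes to k-MEANS.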
Pages: 127-150
Number of pages: 24
Related papers (50 in total)
  • [1] Fast and Robust General Purpose Clustering Algorithms
    V. Estivill-Castro
    J. Yang
    Data Mining and Knowledge Discovery, 2004, 8 : 127 - 150
  • [2] FAST ISODATA CLUSTERING ALGORITHMS
    VENKATESWARLU, NB
    RAJU, PSVSK
    PATTERN RECOGNITION, 1992, 25 (03) : 335 - 342
  • [3] Fast and robust clustering of general-shaped structures with tk-merge
    Insolia, Luca
    Perrotta, Domenico
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2024, 168
  • [4] A fast general-purpose clustering algorithm based on FPGAs for high-throughput data processing
    Annovi, A.
    Beretta, M.
    NUCLEAR INSTRUMENTS & METHODS IN PHYSICS RESEARCH SECTION A-ACCELERATORS SPECTROMETERS DETECTORS AND ASSOCIATED EQUIPMENT, 2010, 617 (1-3) : 254 - 257
  • [5] A fast algorithm for robust constrained clustering
    Fritz, Heinrich
    Garcia-Escudero, Luis A.
    Mayo-Iscar, Agustin
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2013, 61 : 124 - 136
  • [6] MapReduce algorithms for robust center-based clustering in doubling metrics
    Dandolo, Enrico
    Mazzetto, Alessio
    Pietracaprina, Andrea
    Pucci, Geppino
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2024, 194
  • [7] Robust Model Design by Comparative Evaluation of Clustering Algorithms
    Chen, Xiaopeng
    Park, Chanseok
    Gao, Xuehong
    Kim, Bosung
    IEEE ACCESS, 2023, 11 : 88135 - 88151
  • [8] Online clustering algorithms
    Barbakh, Wesam
    Fyfe, Colin
    INTERNATIONAL JOURNAL OF NEURAL SYSTEMS, 2008, 18 (03) : 185 - 194
  • [9] ROBUST BREGMAN CLUSTERING
    Brecheteau, Claire
    Fischer, Aurelie
    Levrard, Clement
    ANNALS OF STATISTICS, 2021, 49 (03) : 1679 - 1701
  • [10] Comparison of the performance of center-based clustering algorithms
    Zhang, B
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, 2003, 2637 : 63 - 74