Improving hierarchical cluster analysis: A new method with outlier detection and automatic clustering

Cited by: 133
Authors
Almeida, J. A. S. [1]
Barbosa, L. M. S. [1]
Pais, A. A. C. C. [1]
Formosinho, S. J. [1]
Affiliation
[1] Univ Coimbra, Dept Quim, P-3004535 Coimbra, Portugal
Keywords
clustering; unsupervised pattern recognition; hierarchical cluster analysis; single linkage; outlier removal;
DOI
10.1016/j.chemolab.2007.01.005
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Techniques based on agglomerative hierarchical clustering constitute one of the most frequent approaches in unsupervised clustering. Some are based on the single linkage methodology, which has been shown to produce good results with sets of clusters of various sizes and shapes. However, the application of this type of algorithm in a wide variety of fields has posed a number of problems, such as sensitivity to outliers and to fluctuations in the density of data points. Additionally, these algorithms do not usually allow for automatic clustering. In this work we propose a method to improve single linkage hierarchical cluster analysis (HCA), so as to circumvent most of these problems and attain the performance of the most sophisticated new approaches. This completely automated method is based on a self-consistent outlier reduction approach, followed by the building-up of a descriptive function. This function, in turn, allows natural clusters to be defined. Finally, the discarded objects may optionally be assigned to these clusters. The method is validated using widely used data sets available from the literature and others created by the authors for specific purposes. Our method is shown to be very efficient in a large variety of situations. © 2007 Elsevier B.V. All rights reserved.
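The three stages named in the abstract (outlier reduction, automatic choice of clusters from a single linkage hierarchy, optional reassignment of discarded objects) can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the outlier criterion or the descriptive function, so the sketch assumes a nearest-neighbour distance threshold (mean plus `k` standard deviations, repeated until stable) and a cut at the largest gap between successive merge distances. All function names and the parameter `k` are hypothetical.

```python
import math
from statistics import mean, pstdev

def drop_outliers(points, k=2.0):
    """Assumed criterion: discard points whose nearest-neighbour distance
    exceeds mean + k * std, and repeat until no point is discarded."""
    kept = list(range(len(points)))
    while len(kept) > 2:
        d = {i: min(math.dist(points[i], points[j]) for j in kept if j != i)
             for i in kept}
        limit = mean(d.values()) + k * pstdev(d.values())
        remaining = [i for i in kept if d[i] <= limit]
        if len(remaining) == len(kept):
            break
        kept = remaining
    return kept

def single_linkage_merges(points, idx):
    """Minimum spanning tree (Prim); its edges, sorted by length,
    are the merges of single linkage clustering."""
    start = idx[0]
    best = {i: (math.dist(points[start], points[i]), start) for i in idx[1:]}
    edges = []
    while best:
        i = min(best, key=lambda n: best[n][0])
        d, j = best.pop(i)
        edges.append((d, i, j))
        for n in best:
            dn = math.dist(points[i], points[n])
            if dn < best[n][0]:
                best[n] = (dn, i)
    return sorted(edges)

def cut_at_largest_gap(edges, idx):
    """Assumed stand-in for the descriptive function: keep only the
    merges that occur below the largest jump in merge distance."""
    heights = [e[0] for e in edges]
    gaps = [b - a for a, b in zip(heights, heights[1:])]
    n_keep = gaps.index(max(gaps)) + 1 if gaps else len(edges)
    parent = {i: i for i in idx}
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for _, i, j in edges[:n_keep]:
        parent[find(i)] = find(j)
    return {i: find(i) for i in idx}

def cluster(points, k=2.0):
    kept = drop_outliers(points, k)
    labels = cut_at_largest_gap(single_linkage_merges(points, kept), kept)
    # Optional last stage: give each discarded point its nearest kept point's label.
    for i in range(len(points)):
        if i not in labels:
            nearest = min(kept, key=lambda j: math.dist(points[i], points[j]))
            labels[i] = labels[nearest]
    return [labels[i] for i in range(len(points))], kept

# Two compact groups of four points plus one distant point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5.5, 30)]
labels, kept = cluster(points)
print(labels, kept)
```

Cutting at the largest gap always yields at least two clusters once three or more points remain, so this stand-in cannot report a single natural cluster; a real descriptive function would need to handle that case.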
Pages: 208-217
Number of pages: 10
Related papers
50 in total
  • [1] New outlier detection method based on fuzzy clustering
    Al-Zoubi, Moh'D Belal
    Al-Dahoud, Ali
    Yahya, Abdelfatah A.
    WSEAS Transactions on Information Science and Applications, 2010, 7 (05): : 681 - 690
  • [2] Automatic PAM clustering algorithm for outlier detection
    Zhu, Q. (qszhu@cqu.edu.cn), 1600, Academy Publisher (07):
  • [3] A cluster-based Outlier detection method without pre-clustering
    Ren, DM
    Wang, BY
    Perrizo, W
    COMPUTER APPLICATIONS IN INDUSTRY AND ENGINEERING, 2004, : 177 - 180
  • [4] An Automatic R-peak Detection Method Based on Hierarchical Clustering
    Chen, Hanjie
    Maharatna, Koushik
    2019 IEEE BIOMEDICAL CIRCUITS AND SYSTEMS CONFERENCE (BIOCAS 2019), 2019,
  • [5] Hierarchical agglomerative clustering based T-outlier detection
    Wang, Dajun
    Fortier, Paul J.
    Michel, Howard E.
    Mitsa, Theophano
    ICDM 2006: SIXTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, WORKSHOPS, 2006, : 731 - +
  • [6] Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection
    Campello, Ricardo J. G. B.
    Moulavi, Davoud
    Zimek, Arthur
    Sander, Joerg
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2015, 10 (01)
  • [7] A Heuristic Automatic Clustering Method Based on Hierarchical Clustering
    LaPlante, Francois
    Belacel, Nabil
    Kardouchi, Mustapha
    AGENTS AND ARTIFICIAL INTELLIGENCE, ICAART 2014, 2015, 8946 : 312 - 328
  • [8] Clustering-Based Outlier Detection Method
    Jiang, Sheng-yi
    An, Qing-bo
    FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2008, : 429 - 433
  • [9] Fuzzy Outlier analysis a combined clustering - Outlier detection approach
    Yousri, Noha A.
    Ismail, Mohammed A.
    Kamel, Mohamed S.
    2007 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-8, 2007, : 1776 - +
  • [10] Outlier and group detection in sensory panels using hierarchical cluster analysis with the Procrustes distance
    Dahl, T
    Næs, T
    FOOD QUALITY AND PREFERENCE, 2004, 15 (03) : 195 - 208