THE EVALUATION OF CHF COEFFICIENT IN DETERMINING THE NUMBER OF CLUSTERS USING EUCLIDEAN DISTANCE MEASURE

Cited by: 0
Author
Loester, Tomas [1 ]
Affiliation
[1] Univ Econ, Dept Stat & Probabil, Prague 13067 3, Czech Republic
Source
8TH INTERNATIONAL DAYS OF STATISTICS AND ECONOMICS | 2014
Keywords
clustering; evaluation of clustering; methods; CHF coefficient;
DOI
Not available
Chinese Library Classification
F [Economics];
Discipline classification code
02;
Abstract
Many clustering methods appear in the current literature, and various distance (or similarity) measures can be used with them, so the resulting assignments of objects to clusters may differ. The literature gives no strict rule for which method to use under which conditions. Determining the number of clusters is also very often part of a cluster analysis. The aim of this paper is to evaluate the CHF coefficient, which is often used to determine the number of clusters. Twenty artificially generated data files are used for clustering, under the condition that the clusters either touch or partially overlap. The files are generated under the same conditions so that the results can be assessed objectively. The analyses show that the CHF coefficient is very successful in determining the number of clusters. When the clusters touch each other, its success rate on the generated files is 100%. When the clusters partially overlap, its success rate decreases; the highest success rate in that case was 70%. It can be concluded that the lower the degree of separation (i.e. the more the individual clusters overlap), the less successful the coefficient is in determining the number of clusters.
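As a rough illustration of how such a coefficient can be used to pick the number of clusters, the sketch below assumes that "CHF" denotes the Calinski-Harabasz pseudo-F index, CHF = [B / (k - 1)] / [W / (n - k)], where B and W are the between-cluster and within-cluster sums of squared Euclidean distances, k is the number of clusters, and n is the number of objects. The toy points and the hand-made candidate partitions are hypothetical; they are not the 20 generated files from the paper, and the clustering method used there is not stated in this record.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def chf(points, labels):
    """Pseudo-F index: [B / (k - 1)] / [W / (n - k)] (assumed meaning of CHF)."""
    n = len(points)
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("the index is defined only for 2 <= k < n")
    overall = centroid(points)
    between = 0.0  # B: size-weighted squared distances of centroids to the overall mean
    within = 0.0   # W: squared distances of objects to their own cluster centroid
    for members in groups.values():
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    if within == 0.0:
        return float("inf")  # every cluster is a single repeated point
    return (between / (k - 1)) / (within / (n - k))


# Hypothetical toy data: three compact, well-separated groups of four points.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1),
    (10, 0), (10, 1), (11, 0), (11, 1),
    (5, 9), (5, 10), (6, 9), (6, 10),
]

# Hand-made candidate partitions for k = 2, 3, 4 (illustrative only).
candidates = {
    2: [0] * 8 + [1] * 4,                   # first two groups merged
    3: [0] * 4 + [1] * 4 + [2] * 4,         # the three groups as generated
    4: [0] * 4 + [1] * 4 + [2, 2, 3, 3],    # third group split in two
}

scores = {k: chf(points, labels) for k, labels in candidates.items()}
best_k = max(scores, key=scores.get)  # the k with the highest index is selected
print(scores, best_k)
```

In practice the candidate partitions would come from running a clustering method for each k rather than being written by hand; the selection rule (take the k with the largest index value) stays the same.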
Pages: 858-869
Number of pages: 12