DISCERN: diversity-based selection of centroids for k-estimation and rapid non-stochastic clustering

Cited by: 8
Authors
Hassani, Ali [1]
Iranmanesh, Amir [1]
Eftekhari, Mahdi [4]
Salemi, Abbas [2, 3]
Affiliations
[1] Shahid Bahonar Univ Kerman, Dept Comp Sci, Pajoohesh Sq, Kerman 7616914111, Iran
[2] Shahid Bahonar Univ Kerman, Dept Appl Math, Pajoohesh Sq, Kerman 7616914111, Iran
[3] Shahid Bahonar Univ Kerman, Mahani Math Res Ctr, Pajoohesh Sq, Kerman 7616914111, Iran
[4] Shahid Bahonar Univ Kerman, Dept Comp Engn, Pajoohesh Sq, Kerman 7616914111, Iran
Keywords
Clustering; K-means initialization; Estimating the number of clusters; Unsupervised learning; Deterministic K-means;
DOI
10.1007/s13042-020-01193-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
One of the applications of center-based clustering algorithms such as K-means is partitioning data points into K clusters. In some examples, the feature space relates to the underlying problem we are trying to solve, and sometimes we can obtain a suitable feature space. Nevertheless, while K-means is one of the most efficient offline clustering algorithms, it is not equipped to estimate the number of clusters, which is useful in some practical cases. Other practical methods which do are simply too complex, as they require at least one run of K-means for each possible K. In order to address this issue, we propose a K-means initialization similar to K-means++, which would be able to estimate K based on the feature space while finding suitable initial centroids for K-means in a deterministic manner. Then we compare the proposed method, DISCERN, with a few of the most practical K estimation methods, while also comparing clustering results of K-means when initialized randomly, using K-means++ and using DISCERN. The results show improvement in both the estimation and final clustering performance.
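The abstract does not describe how DISCERN actually selects its centroids, so the block below is not the authors' algorithm. It is a minimal sketch of a generic deterministic, diversity-driven seeding (farthest-first traversal), included only to illustrate what a non-stochastic alternative to random or K-means++ initialization can look like. The function name `farthest_first_seeds`, the use of Euclidean distance, and the choice of starting point are all assumptions made for illustration.

```python
import numpy as np


def farthest_first_seeds(X, k):
    """Deterministically pick k rows of X as initial centroids.

    Illustrative farthest-first traversal; NOT the DISCERN procedure.
    """
    X = np.asarray(X, dtype=float)
    # Assumed starting rule: the point farthest from the data mean.
    # np.argmax returns the lowest index on ties, so the result is repeatable.
    first = int(np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1)))
    chosen = [first]
    # Distance from every point to its nearest already-chosen seed.
    nearest = np.linalg.norm(X - X[first], axis=1)
    while len(chosen) < k:
        # The next seed is the point least covered by the current seeds.
        nxt = int(np.argmax(nearest))
        chosen.append(nxt)
        nearest = np.minimum(nearest, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen]
```

Because no random draws are involved, repeated calls on the same data return the same seeds, which is the property the abstract refers to as initializing "in a deterministic manner". This sketch still takes K as an input; estimating K from the feature space, as the abstract says DISCERN does, is outside its scope.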
Pages: 635-649
Number of pages: 15