Using Annealing to Accelerate Triangle Inequality k-means

被引:0
作者
Zhakubayev, Alibek [1 ]
Hamerly, Greg [1 ]
机构
[1] Baylor Univ, Dept Comp Sci, Waco, TX 76798 USA
来源
2024 IEEE 11TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS, DSAA 2024 | 2024年
关键词
k-means; clustering; machine learning; acceleration; annealing;
D O I
10.1109/DSAA61799.2024.10722770
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The k-means algorithm calculates the distances between all points and centers at every iteration, which results in a significant amount of wasted work and slows down the algorithm. Many algorithms have been proposed to increase the convergence speed by skipping unnecessary distance computations, often using distance bounds that are cheaply adjusted with the triangle inequality. This paper proposes an annealing technique that can further accelerate these algorithms by tightening the bounds. As a result, we skip even more distance computations and only calculate the distances when the chance of changing an assignment is high. The function that tightens the bound is adaptive and depends on the number of points that change the assignment in the last iteration. The annealing technique can speed up both Hamerly's and Elkan's accelerated algorithms without lowering the output quality. Experimental results showed a time improvement of up to 15 percent, representing a significant performance boost over the Hamerly algorithm.
引用
收藏
页码:128 / 136
页数:9
相关论文
共 23 条
[1]  
[Anonymous], 2009, LEARNING MULTIPLE LA
[2]  
[Anonymous], 2010, P 2010 SIAM INT C DA, DOI DOI 10.1137/1.9781611972801.12
[3]  
[Anonymous], 2003, P 20 INT C MACHINE L, DOI DOI 10.1016/0026-2714(92)90278-S
[4]  
[Anonymous], 2013, Ph.D. dissertation,
[5]  
Blackard J., 1998, UCI Machine Learning Repository, DOI DOI 10.24432/C50K5N
[6]  
Boutsidis Christos, 2010, Advances in neural information processing systems, V23
[7]   A comparative study of efficient initialization methods for the k-means clustering algorithm [J].
Celebi, M. Emre ;
Kingravi, Hassan A. ;
Vela, Patricio A. .
EXPERT SYSTEMS WITH APPLICATIONS, 2013, 40 (01) :200-210
[8]   Dimensionality Reduction for k-Means Clustering and Low Rank Approximation [J].
Cohen, Michael B. ;
Elder, Sam ;
Musco, Cameron ;
Musco, Christopher ;
Persu, Madalina .
STOC'15: PROCEEDINGS OF THE 2015 ACM SYMPOSIUM ON THEORY OF COMPUTING, 2015, :163-172
[9]  
Cup K., 2004, Protein homology dataset
[10]  
Deng L., IEEE Signal Processing Magazine, V29