An Improved K-means Clustering Algorithm Based on Normal Matrix

被引：0

作者：

Tian Shengwen ^{[1
]}

Zhao Yongsheng ^{[1
]}

Wang Yilei ^{[1
]}

机构：

[1] Ludong Univ, Sch Comp Sci & Technol, Yantai 264025, Peoples R China

来源：

PROCEEDINGS OF THE SECOND INTERNATIONAL SYMPOSIUM ON TEST AUTOMATION AND INSTRUMENTATION, VOL 4 | 2008年

关键词：

K-means; Normal matrix; Similarity; Eigenvector;

D O I：

暂无

中图分类号：

TP [自动化技术、计算机技术];

学科分类号：

0812 ;

摘要：

It is well known that K-means algorithm is very sensitive to outliers, and often terminates at a local optimum. Therefore, the quality of the result is not satisfactory. Furthermore, it is usually designed as memory-resident algorithms, which limit the scalability. In this paper, we an improved K-means clustering algorithm - NK-means is developed. NK-means is based on spectral methods, namely uses Normal matrix that is used in spectral analysis approaches to normalize original datasets, and then finds clusters in the processed datasets by K-means algorithm. During the normalization, weighted adjacency matrix is constructed by the similarity between objects belonged in original datasets, and to generate Normal matrix. NK-means algorithm mainly uses the characteristic structure to process original datasets. In the processing, the data is transformed from a matrix into a vector, and the structures of the clusters are changed from three-dimensional sphere into two-dimensional circle. The experiment shows that NK-means algorithm significantly outperforms K-means in the efficiency and accuracy.

引用

页码：2182 / 2185

页数：4

共 5 条

[1] Detecting communities in large networks [J].

Capocci, A ;

Servedio, VDP ;

Caldarelli, G ;

Colaiori, F .

PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2005, 352 (2-4) :669-676

[2]

Han J., 2001, Data Mining Concepts and Techniques

[3]

MacQueen J., 1967, Proc fifth Berkeley Symp Math Stat Probab, V1, P281

[4]

Qian Wei-ning, 2002, Journal of Software, V13, P1382

[5]

WANG X F, 2006, THEORY APPL COMPLEX

← 1 →