An Improved K-means Text Clustering Algorithm by Optimizing Initial Cluster Centers

Cited by: 0
Authors
Xiong, Caiquan [1]
Hua, Zhen [1]
Lv, Ke [1]
Li, Xuan [1]
Affiliations
[1] Hubei Univ Technol, Sch Comp Sci, Wuhan, Hubei, Peoples R China
Source
2016 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CCBD) | 2016
Funding
National Natural Science Foundation of China;
Keywords
K-means algorithm; initial cluster centers; Text clustering;
DOI
10.1109/CCBD.2016.29
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
K-means clustering is an influential algorithm in data mining. The traditional K-means algorithm is sensitive to the initial cluster centers, so the clustering result depends excessively on which centers are chosen. To overcome this shortcoming, this paper proposes an improved K-means text clustering algorithm that optimizes the initial cluster centers. The algorithm first calculates the density of each data object in the data set and then judges which data objects are isolated points. After all isolated points are removed, a set of high-density data objects is obtained. The algorithm then chooses as the initial cluster centers the k high-density data objects that are farthest from one another. The experimental results show that the improved K-means algorithm can improve the stability and accuracy of text clustering.
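The steps named in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not define density, the isolated-point test, or the distance measure, so the code assumes (a) density is the number of other points within a hypothetical `radius`, (b) a point is isolated when it has fewer than a hypothetical `min_neighbors` neighbours, (c) the high-density set is the kept points whose density is at or above the mean, and (d) Euclidean distance with a greedy farthest-point pick stands in for "the distance between the data objects is the largest". The function name `choose_initial_centers` is also hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def choose_initial_centers(points, k, radius, min_neighbors=1):
    """Pick k initial centers from dense, mutually distant points.

    All thresholds and definitions here are assumptions made for
    illustration; the abstract does not specify them.
    """
    n = len(points)

    # Assumed density: number of other points within `radius`.
    density = [
        sum(
            1
            for j in range(n)
            if j != i and euclidean(points[i], points[j]) <= radius
        )
        for i in range(n)
    ]

    # Assumed isolated-point rule: fewer than `min_neighbors` neighbours.
    kept = [i for i in range(n) if density[i] >= min_neighbors]
    if len(kept) < k:
        raise ValueError("fewer than k non-isolated points remain")

    # Assumed high-density set: density at or above the mean of kept points.
    mean_density = sum(density[i] for i in kept) / len(kept)
    dense = [i for i in kept if density[i] >= mean_density]
    if len(dense) < k:
        dense = kept  # fall back to every non-isolated point

    # Greedy farthest-point selection: start at the densest point, then
    # repeatedly add the candidate whose nearest chosen center is farthest.
    chosen = [max(dense, key=lambda i: density[i])]
    while len(chosen) < k:
        best = max(
            (i for i in dense if i not in chosen),
            key=lambda i: min(
                euclidean(points[i], points[c]) for c in chosen
            ),
        )
        chosen.append(best)

    return [points[i] for i in chosen]


if __name__ == "__main__":
    # Two tight groups plus one far-away outlier.
    data = [
        (0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (50, 50),
    ]
    print(choose_initial_centers(data, k=2, radius=1.5))
```

For text data, the points would typically be document vectors such as TF-IDF weights, and a cosine-based distance may be a better fit than the Euclidean one assumed here. The returned centers would then seed an ordinary K-means run in place of random initialization.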
Pages: 265-268
Number of pages: 4
Related papers
50 in total
  • [1] Semi-supervised K-Means Clustering by Optimizing Initial Cluster Centers
    Wang, Xin
    Wang, Chaofei
    Shen, Junyi
    WEB INFORMATION SYSTEMS AND MINING, PT II, 2011, 6988 : 178 - +
  • [2] A new method for selecting initial cluster centers in k-means clustering algorithm
    Zhang, Guoying
    Sha, Yun
    He, Yuanjiao
    2008 PROCEEDINGS OF INFORMATION TECHNOLOGY AND ENVIRONMENTAL SYSTEM SCIENCES: ITESS 2008, VOL 2, 2008, : 879 - 883
  • [3] An Improved Three-Way K-Means Algorithm by Optimizing Cluster Centers
    Guo, Qihang
    Yin, Zhenyu
    Wang, Pingxin
    SYMMETRY-BASEL, 2022, 14 (09):
  • [4] Improved research to k-means initial cluster centers
    Zhang Min
    Duan Kai-fei
    2015 NINTH INTERNATIONAL CONFERENCE ON FRONTIER OF COMPUTER SCIENCE AND TECHNOLOGY FCST 2015, 2015, : 348 - 352
  • [5] On selecting the Initial Cluster Centers in the K-means Algorithm
    Tanir, Deniz
    Nuriyeva, Fidan
    2017 11TH IEEE INTERNATIONAL CONFERENCE ON APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES (AICT 2017), 2017, : 131 - 135
  • [6] A new algorithm for initial cluster centers in k-means algorithm
    Erisoglu, Murat
    Calis, Nazif
    Sakallioglu, Sadullah
    PATTERN RECOGNITION LETTERS, 2011, 32 (14) : 1701 - 1705
  • [7] Improved K-Means algorithm in text semantic clustering
    Ma, Junhong
    Open Cybernetics and Systemics Journal, 2014, 8 : 530 - 534
  • [8] K-means Clustering Algorithm with improved Initial Center
    Zhang Chen
    Xia Shixiong
    WKDD: 2009 SECOND INTERNATIONAL WORKSHOP ON KNOWLEDGE DISCOVERY AND DATA MINING, PROCEEDINGS, 2009, : 790 - 792
  • [9] A modified parallel k-means clustering with improved initial centers
    Yu, Yuecheng
    Wang, Jiandong
    Zheng, Guansheng
    Gu, Bin
    Journal of Computational Information Systems, 2010, 6 (12): : 4091 - 4098
  • [10] An Optimized k-means Algorithm for Selecting Initial Clustering Centers
    Song, Jianhui
    Li, Xuefei
    Liu, Yanju
    INTERNATIONAL JOURNAL OF SECURITY AND ITS APPLICATIONS, 2015, 9 (10): : 177 - 186