An Improved K-means Text Clustering Algorithm by Optimizing Initial Cluster Centers

Cited by: 0
Authors
Xiong, Caiquan [1]
Hua, Zhen [1]
Lv, Ke [1]
Li, Xuan [1]
Affiliations
[1] Hubei Univ Technol, Sch Comp Sci, Wuhan, Hubei, Peoples R China
Source
2016 7TH INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA (CCBD) | 2016
Funding
National Natural Science Foundation of China
Keywords
K-means algorithm; initial cluster centers; text clustering
DOI
10.1109/CCBD.2016.29
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The K-means clustering algorithm is an influential algorithm in data mining. The traditional K-means algorithm is sensitive to the initial cluster centers, so the clustering result depends excessively on which centers are chosen. To overcome this shortcoming, this paper proposes an improved K-means text clustering algorithm that optimizes the initial cluster centers. The algorithm first calculates the density of each data object in the data set and then judges which data objects are isolated points. After all isolated points are removed, a set of high-density data objects is obtained. The algorithm then chooses as initial cluster centers the k high-density data objects whose mutual distances are the largest. The experimental results show that the improved K-means algorithm improves the stability and accuracy of text clustering.
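The three steps named in the abstract (density estimation, isolated-point removal, selection of k mutually distant high-density objects) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not define density, the isolated-point test, or the distance measure, so the sketch assumes density is the number of neighbours within a hypothetical `radius`, treats objects with fewer than a hypothetical `min_density` neighbours as isolated, uses Euclidean distance, and picks centers greedily with a farthest-point (maximin) rule. The function name `select_initial_centers` is likewise invented for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_initial_centers(points, k, radius, min_density=1):
    """Pick k initial centers from dense, mutually distant points.

    Assumptions (not taken from the paper): density is the number of other
    points within `radius`; a point is isolated if its density is below
    `min_density`; high density means at least the mean density of the
    non-isolated points.
    """
    n = len(points)
    # Step 1: density of every data object.
    density = [
        sum(1 for j in range(n)
            if j != i and euclidean(points[i], points[j]) <= radius)
        for i in range(n)
    ]
    # Step 2: drop isolated points.
    kept = [i for i in range(n) if density[i] >= min_density]
    if len(kept) < k:
        raise ValueError("fewer than k non-isolated points remain")
    # Step 3: restrict to high-density objects, falling back to all kept
    # points if the high-density set is too small to supply k centers.
    mean_density = sum(density[i] for i in kept) / len(kept)
    candidates = [i for i in kept if density[i] >= mean_density]
    if len(candidates) < k:
        candidates = kept
    # Step 4: start from the densest candidate, then repeatedly add the
    # candidate whose nearest already-chosen center is farthest away.
    centers = [max(candidates, key=lambda i: density[i])]
    while len(centers) < k:
        nxt = max(
            (i for i in candidates if i not in centers),
            key=lambda i: min(euclidean(points[i], points[c]) for c in centers),
        )
        centers.append(nxt)
    return [points[i] for i in centers]
```

The returned centers would then seed an ordinary K-means run in place of random initialization; for text data the points would typically be document vectors, and a cosine-based distance may be a better fit than the Euclidean distance assumed here.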
Pages: 265-268
Page count: 4