A Novel Text Clustering Method Based on TGSOM and Fuzzy K-Means

Cited by: 0
Authors
Hu, Jinzhu [1]
Xiong, Chunxiu [1]
Shu, Jiangbo [1]
Zhou, Xing [1]
Zhu, Jun [1]
Affiliations
[1] Hua Zhong Normal Univ, Dept Comp Sci, Wuhan 43079, Peoples R China
Source
PROCEEDINGS OF THE FIRST INTERNATIONAL WORKSHOP ON EDUCATION TECHNOLOGY AND COMPUTER SCIENCE, VOL I | 2009
Keywords
tree-structured growing self-organizing maps; Fuzzy K-Means; text clustering; text clustering flow model;
DOI
10.1109/ETCS.2009.14
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
To address the high-dimensional, sparse representation of text documents and the shortcomings of existing clustering methods, this paper proposes a text clustering approach, TGSOM-FS-FKM, based on tree-structured growing self-organizing maps (TGSOM) and Fuzzy K-Means (FKM). The method first preprocesses the texts and filters out most noisy words with an unsupervised feature selection method. It then uses TGSOM for a first clustering pass, which yields a rough grouping of the texts, an initial number of clusters, and a category for each text. Next, latent semantic analysis (LSA) is introduced to improve clustering precision and to reduce the dimension of the feature vectors. A second TGSOM clustering pass then produces a more precise result, and a supervised feature selection method is used to select feature items. Finally, FKM clusters the resulting set. The number of feature items was kept the same across the experiments. Experimental results indicate that TGSOM-FS-FKM outperforms other clustering methods such as DSOM-FS-FCM, and that its precision is better than that of DSOM-FCM, DFKCN and FDMFC clustering.
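The final FKM stage named in the abstract can be illustrated with a minimal sketch. It assumes that FKM follows the standard fuzzy c-means update rules, that the fuzzifier is m = 2, and that the initial centroids are handed over from an earlier stage; none of these details is given in the abstract. The function name `fuzzy_k_means` and its parameters are hypothetical, and this is not the authors' implementation.

```python
import math


def fuzzy_k_means(points, centers, m=2.0, max_iter=100, tol=1e-6):
    """Soft-cluster `points` around `centers` with fuzzy c-means style updates.

    points  : list of equal-length lists of floats (e.g. reduced text vectors)
    centers : initial centroids (assumption: supplied by an earlier stage)
    m       : fuzzifier > 1 (assumption: 2.0, the abstract gives no value)
    Returns (centers, memberships); memberships[i][j] is the degree to which
    point i belongs to cluster j, and every row sums to 1.
    """
    centers = [list(c) for c in centers]
    dim = len(points[0])
    power = 2.0 / (m - 1.0)
    memberships = []

    for _ in range(max_iter):
        # Membership update: u_ij = 1 / sum_l (d_ij / d_il) ** (2 / (m - 1))
        memberships = []
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            if any(d == 0.0 for d in dists):
                # The point coincides with a centroid: full membership there.
                hits = [1.0 if d == 0.0 else 0.0 for d in dists]
                row = [h / sum(hits) for h in hits]
            else:
                row = [1.0 / sum((dj / dl) ** power for dl in dists)
                       for dj in dists]
            memberships.append(row)

        # Centroid update: mean of all points weighted by u_ij ** m
        new_centers = []
        for j, old in enumerate(centers):
            weights = [row[j] ** m for row in memberships]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(old)  # no support: keep the old centroid
                continue
            new_centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total
                 for d in range(dim)]
            )

        # Stop once no centroid moved by more than `tol`
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break

    return centers, memberships


# Toy data standing in for reduced document vectors (made up for illustration)
docs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
centers, u = fuzzy_k_means(docs, centers=[[0.0, 0.0], [1.0, 1.0]])
# Hard labels, if needed, are the index of the largest membership per document
labels = [row.index(max(row)) for row in u]
```

Unlike hard k-means, each document keeps a graded membership in every cluster, which suits texts that touch on several topics. In the pipeline described in the abstract, the number of clusters and the starting centroids would plausibly come from the TGSOM passes rather than being chosen by hand as in this toy call.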
Pages: 26-30
Number of pages: 5