Research on Distributed Text Clustering Based on Frequent Itemset

Cited by: 0

Authors:

Yang, Wenchuan [1]

Wu, Qiwei [1]

Cheng, Zishuai [1]

Affiliations:

[1] Beijing Univ Posts & Telecommun, Sch Network Secur, Beijing 100876, Peoples R China

Source:

PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017) | 2017

Funding:

National Natural Science Foundation of China;

Keywords:

Text clustering; Frequent Itemset; Correlation analysis; Hadoop;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

Text clustering, a significant field in natural language processing, is a key technology for processing and organizing massive text data. In the era of big data, however, the sheer volume of data poses great challenges to both the running time and the accuracy of text clustering. This paper focuses on the speed and precision of text clustering, combining a genetic algorithm, feedback, and distributed computing. A distributed text clustering method based on frequent itemsets is proposed. The experimental results show that it finds the globally optimal cluster centers more efficiently and makes the clustering more accurate.

Pages: 5700-5705

Page count: 6