Research of feature selection for text clustering based on cloud model

Cited by: 4
Authors
Zhao, Junmin [1]
Zhang, Kai [1]
Wan, Jian [2]
Affiliations
[1] Henan University of Urban Construction, Institute of Computer Science and Engineering, Pingdingshan
[2] ZhengZhou ShiYi Technology Co. Ltd, Zhengzhou
Keywords
Cloud model; Feature selection; K-means algorithm; TF-IDF
DOI
10.4304/jsw.8.12.3246-3252
Abstract
Text clustering is an unsupervised machine learning task, so the discriminability of class attributes cannot be measured during clustering, and traditional text feature selection methods cannot effectively solve the high-dimensionality problem. To overcome these weaknesses in existing feature selection, this paper proposes a new method that introduces cloud model theory into feature selection and constructs a cloud filter for clustering documents. The distribution of document words is modeled at a microscopic level. By employing the digital characteristics of the cloud model, the separability between feature words can be computed more effectively. Experimental results with the K-means algorithm show that the method remarkably improves the accuracy of text clustering. © 2013 Academy Publisher.
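The abstract does not spell out how the cloud model digital characteristics are obtained. As a rough illustration only, the sketch below uses the standard backward cloud generator from cloud model theory to estimate the three characteristics (expectation Ex, entropy En, hyper-entropy He) from a sample. The function name `backward_cloud` and the idea of feeding it one word's per-document frequencies are assumptions made for this example, not details taken from the paper.

```python
import math


def backward_cloud(samples):
    """Estimate cloud model digital characteristics (Ex, En, He).

    Hypothetical helper: uses the standard backward cloud generator
    without certainty degrees. `samples` could be, for example, the
    frequencies of one word across documents (an assumption).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")

    # Ex: expectation, estimated by the sample mean.
    ex = sum(samples) / n

    # En: entropy, from the first-order absolute central moment.
    abs_dev = sum(abs(x - ex) for x in samples) / n
    en = math.sqrt(math.pi / 2) * abs_dev

    # He: hyper-entropy, from the unbiased sample variance S^2,
    # using He^2 = S^2 - En^2 and clamping at zero when S^2 < En^2.
    var = sum((x - ex) ** 2 for x in samples) / (n - 1)
    he = math.sqrt(max(var - en ** 2, 0.0))

    return ex, en, he


# Made-up per-document frequencies of a single word.
ex, en, he = backward_cloud([1, 2, 3, 4, 5])
print(ex, en, he)
```

Under this reading, a word whose frequencies are concentrated (small En and He) behaves differently from one that is spread evenly across documents; how the paper turns these values into a separability score is not stated in the abstract.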
Pages: 3246-3252
Number of pages: 6