Text clustering based on kernel KNN clustering algorithm

Cited by: 0
Authors
Xiong, Hao [1]
Sun, Sheng [1]
Feng, Yunfang [1]
Affiliations
[1] Computer School, Hubei Polytechnic University, Huangshi 435003, Hubei, China
Keywords
Attribute selection; Collection of documents; Document clustering; Higher-dimensional; K-nearest neighbors; Kernel methods; Nonlinear functions; Text clustering
DOI
Not available
Abstract
Document clustering is a popular tool for automatically organizing large collections of documents. In this paper, we propose a kernel-based K-nearest-neighbor clustering algorithm (KKNNC) built on the KNN method. The algorithm maps samples into a higher-dimensional feature space using a nonlinear function before clustering, then separates them linearly in the kernel space. We also propose a new attribute selection method, the ATS algorithm, which selects important terms in documents. Our algorithm first uses ATS to eliminate redundant attributes in the data set, then assigns each selected attribute a weight according to the relationships among the attributes. Experimental results show that the algorithm is effective on the text clustering task. © 2013 by CESER Publications.
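The kernel step in the abstract can be illustrated with the standard kernel trick: the distance between two mapped samples follows from kernel values alone, so the nonlinear map never has to be computed explicitly. The sketch below is a minimal illustration under stated assumptions, not the authors' KKNNC algorithm. The Gaussian (RBF) kernel, the `gamma` value, the toy vectors, and the function names `rbf_kernel`, `kernel_distance`, and `kernel_knn` are all hypothetical choices made for this example.

```python
import math

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; the kernel choice and gamma are assumptions."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def kernel_distance(x, y, kernel=rbf_kernel):
    """Distance between the mapped samples, using only kernel values:
    ||phi(x) - phi(y)||^2 = K(x, x) - 2 K(x, y) + K(y, y)."""
    sq = kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)
    return math.sqrt(max(sq, 0.0))  # clamp tiny negative rounding errors

def kernel_knn(points, query_index, k, kernel=rbf_kernel):
    """Indices of the k nearest neighbors of points[query_index] in kernel space."""
    q = points[query_index]
    others = [i for i in range(len(points)) if i != query_index]
    others.sort(key=lambda i: kernel_distance(q, points[i], kernel))
    return others[:k]

# Toy term-weight vectors standing in for documents (made-up values).
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
neighbors = kernel_knn(docs, 0, 1)
```

With these toy vectors, each document's nearest neighbor in kernel space is the other document dominated by the same term. A clustering method would still have to group samples from such neighbor lists; the abstract does not specify how KKNNC does this, so that step is left out.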
Pages: 69 / 75