Document Clustering Using Gravitational Ensemble Clustering

Times cited: 0
Authors
Sadeghian, Armindokht Hashempour [1]
Nezamabadi-pour, Hossein [1]
Affiliations
[1] Shahid Bahonar Univ Kerman, Dept Elect Engn, Kerman, Iran
Source
2015 INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND SIGNAL PROCESSING (AISP) | 2015
Keywords
Data mining; Data clustering; Document clustering; Gravitational clustering; Quality measures
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Text mining is generally regarded as an extension of data mining. In text mining, document clustering groups similar documents of a collection into the same category, called a cluster, and assigns dissimilar documents to different clusters. Because every dataset has its own characteristics, finding a clustering algorithm that can handle all kinds of clusters is a major challenge. Each clustering algorithm has its own approach to computing the number of clusters, imposing a structure on the data, and validating the resulting clusters. Combining different clusterings is an attempt to overcome the weaknesses of individual algorithms and further improve their performance. Separately, several clustering algorithms inspired by the law of gravitation have been introduced, each of which attempts to cluster complex datasets. Gravitational Ensemble Clustering (GEC) is an ensemble method that combines the concepts of gravitational clustering and ensemble clustering to reach a better clustering result. This paper presents an application of GEC to the problem of document clustering. The proposed method uses a modification of the original GEC algorithm that tries to produce a more diverse clustering ensemble through a new parameter setting. The GEC algorithm is evaluated on document datasets, and the presented method obtains promising results in comparison with competing algorithms.
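The idea of combining several clusterings into one can be illustrated with a minimal sketch. The code below is not the GEC algorithm from the paper; it assumes a generic co-association (evidence accumulation) scheme in which two documents are merged when they share a cluster in more than a chosen fraction of the base clusterings. The function names, the `threshold` parameter, and the toy label lists are all hypothetical.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of base partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    m = len(partitions)
    return [[1.0 if i == j else counts[i][j] / m for j in range(n)]
            for i in range(n)]


def consensus(partitions, threshold=0.5):
    """Link items whose co-association exceeds the threshold (assumed rule)."""
    matrix = co_association(partitions)
    n = len(matrix)
    parent = list(range(n))  # union-find forest over the items

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel roots as consecutive integers in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three hypothetical base clusterings of five documents.
base = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2],
    [0, 0, 0, 1, 1],
]
labels = consensus(base)
print(labels)
```

Label values differ across the base clusterings, which is why the sketch compares pairs of documents rather than the labels themselves. A more diverse set of base clusterings, which the abstract says the modified parameter setting aims for, would enter here simply as additional rows in `base`.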
Pages: 240-245
Number of pages: 6