Comparative Study of Clustering Algorithms in Text Mining Context

Cited by: 10
Authors
Jalil, Abdennour Mohamed [1]
Hafidi, Imad [1]
Alami, Lamiae [1]
Ensa, Khouribga [1]
Affiliations
[1] Lab IPOSI, Denver, CO 80204 USA
Source
INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE | 2016 / Vol. 3 / No. 07
Keywords
Algorithms; Clustering; Data; Text Mining;
DOI
10.9781/ijimai.2016.376
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The spectacular growth of data is due to the emergence of networks and smartphones. About 42% of the world population uses the internet [1], which has created a problem of processing the exchanged data: its volume is rising exponentially and it should be handled automatically. This paper presents a classical knowledge discovery in databases process for treating textual data. The process is divided into three parts: preprocessing, processing, and post-processing. In the processing step, we present a comparative study of several clustering algorithms: KMeans, Global KMeans, Fast Global KMeans, Two Level KMeans, and FWKmeans. The algorithms are compared on real textual data collected from the web using RSS feeds. Experimental results identified two problems. The first is the quality of the results, which remains an issue for the algorithms that converge rapidly. The second is the execution time, which needs to be reduced for some algorithms.
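As a rough illustration of the preprocessing and processing steps named in the abstract, the sketch below vectorizes a few short texts and clusters them with plain KMeans (Lloyd's iteration). It is a minimal sketch under stated assumptions, not the paper's implementation: the tokenizer, the smoothed TF-IDF weighting, the fixed initial centroids, and the four sample headlines are all hypothetical, and the Global, Fast Global, Two Level, and FWKmeans variants are not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Preprocessing (assumed): lowercase, keep alphabetic tokens of 3+ letters."""
    return re.findall(r"[a-z]{3,}", text.lower())


def tfidf_vectors(docs):
    """Map each document to an L2-normalized sparse TF-IDF vector (a dict)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Smoothed IDF so that no term gets a zero weight.
        vec = {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in set(a) | set(b))


def mean(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for k, w in v.items():
            total[k] += w
    return {k: w / len(vectors) for k, w in total.items()}


def kmeans(vectors, k, init_indices, max_iter=100):
    """Plain KMeans; init_indices picks the starting centroids (an assumption)."""
    centroids = [dict(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors
        ]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: recompute each non-empty cluster's centroid.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = mean(members)
    return labels


# Hypothetical headlines standing in for RSS feed items.
docs = [
    "Stock markets rally as investors cheer bank earnings",
    "Bank shares lift stock markets after strong earnings",
    "Football team wins the league final after penalty shootout",
    "League champions celebrate football final victory",
]
labels = kmeans(tfidf_vectors(docs), k=2, init_indices=[0, 2])
print(labels)
```

With these sample texts the two finance headlines and the two football headlines end up in separate clusters. The result of plain KMeans depends on the initial centroids, which is the sensitivity that global variants of the algorithm aim to reduce, typically at a higher execution cost.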
Pages: 42-45
Number of pages: 4
Related papers
14 in total
  • [1] [Anonymous], 2005, KNOWLEDGE DISCOVERY
  • [2] Azzopardi Joel, 2012, INCREMENTAL CLUSTERI
  • [3] Chitta Radha, 2010, JOURNALPATTERN RECOG, V43, P796
  • [4] Davies, 2012, IEE T PATTERN MACHIN, VPAMI-1, P224
  • [5] IBEKWE-SANJUAN Fidelia, 2007, INGENIERIE LINGUISTI
  • [6] Jing Liping, 2008, WORLD ACAD SCI ENG T, V2
  • [7] Kaufman L., 1987, Statistical Data Analysis Based on the L1-Norm and Related Methods. First International Conference, P405
  • [8] A fuzzy c-means bi-sonar-based Metaheuristic Optimization Algorithm
    Khan, Koffka
    Sahai, Ashok
    [J]. INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2012, 1 (07): 26-32
  • [9] Laia Jim Z. C., 2009, FAST GLOBAL KMEANS C
  • [10] LIKAS A, 2003, GLOBAL K MEANS CLUST