Apache Mahout's k-Means vs. Fuzzy k-Means Performance Evaluation

Cited by: 0
Authors
Xhafa, Fatos [1, 3]
Bogza, Adriana [1 ]
Caballe, Santi [2 ]
Barolli, Leonard
Affiliations
[1] Univ Politecn Cataluna, Barcelona, Spain
[2] Univ Oberta Catalunya, Barcelona, Spain
[3] Univ Oberta Catalunya, SmartLearn Grp, Barcelona, Spain
Source
2016 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT NETWORKING AND COLLABORATIVE SYSTEMS (INCOS) | 2016
Keywords
Data Mining Algorithms; Apache Mahout; Big Data; k-Means; Fuzzy k-Means; Performance; Hadoop Cluster;
DOI
10.1109/INCoS.2016.103
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The emergence of Big Data as a disruptive technology for the next generation of intelligent systems has raised many issues of how to extract and use the knowledge obtained from data within short time frames, on limited budgets, and under high rates of data generation. The foremost challenge identified here is data processing, especially mining and analysis for knowledge extraction. Because older data mining frameworks were designed without Big Data requirements, a new generation of such frameworks is being developed and fully implemented on Cloud platforms. One such framework is Apache Mahout, which aims to support fast processing and analysis of Big Data. The performance of these new data mining frameworks has yet to be evaluated, and their potential limitations have yet to be revealed. In this paper we analyse the performance of Apache Mahout using large real data sets from the Twitter stream. We exemplify the analysis with two clustering algorithms, namely k-Means and Fuzzy k-Means, using a Hadoop cluster infrastructure for the experimental study.
Pages: 110-116
Number of pages: 7