Strategies for Big Data Clustering

Cited by: 28
Authors
Kurasova, Olga [1]
Marcinkevicius, Virginijus [1]
Medvedev, Viktor [1]
Rapecka, Aurimas [1]
Stefanovic, Pavel [1]
Affiliations
[1] Vilnius State Univ, Inst Math & Informat, LT-08663 Vilnius, Lithuania
Source
2014 IEEE 26TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI) | 2014
Keywords
big data; clustering methods; data mining; Hadoop; visual analysis
DOI
10.1109/ICTAI.2014.115
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an overview of methods and technologies used for big data clustering. Clustering is one of the important data mining problems, especially in big data analysis, where large volumes of data must be grouped. Several clustering methods are described, with particular attention to the k-means method and its modifications, because k-means remains one of the popular methods and is implemented in innovative technologies for big data analysis. Neural network-based self-organizing maps and their extensions for big data clustering are also reviewed. Some strategies for big data clustering are presented and discussed. The paper shows what volume of data can be clustered in the well-known data mining systems WEKA and KNIME, and when new, more sophisticated technologies are needed.
Pages: 740-747
Page count: 8