Weighted consensus clustering and its application to Big data

Cited by: 17
Authors
Alguliyev, Rasim M. [1]
Aliguliyev, Ramiz M. [1]
Sukhostat, Lyudmila V. [1]
Affiliations
[1] Azerbaijan Natl Acad Sci, Inst Informat Technol, 9A B Vahabzade St, AZ-1141 Baku, Azerbaijan
Keywords
Weighted consensus clustering; Big data; Utility function; Purity-based utility function; Co-association matrix; ENSEMBLE; ALGORITHM; INDEXES;
DOI
10.1016/j.eswa.2020.113294
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The aim of this study is the development of a weighted consensus clustering that assigns weights to single clustering methods using the purity utility function. In the case of Big data that does not contain labels, the utility function based on the Davies-Bouldin index is proposed in this paper. The Banknote authentication, Phishing, Diabetic, Magic04, Credit card clients, Covertype, Phone accelerometer, and NSL-KDD datasets are used to assess the efficiency of the proposed consensus approach. The proposed approach is evaluated using the Euclidean, Minkowski, squared Euclidean, cosine, and Chebyshev distance metrics. It is compared with single clustering algorithms (DBSCAN, OPTICS, CLARANS, k-means, and shared nearby neighbor clustering). The experimental results show the effectiveness of the proposed approach to Big data clustering compared with single clustering methods. The proposed weighted consensus clustering using the squared Euclidean distance metric achieves the highest accuracy, which is a very promising result for Big data clustering. It can be applied to expert systems to help experts make group decisions based on several alternatives. The paper also provides directions for future research on consensus clustering in this area. © 2020 Elsevier Ltd. All rights reserved.
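The abstract names three ingredients: a purity-based utility function, per-method weights, and a co-association matrix. The sketch below shows one way these pieces could fit together on a toy example. It is an illustration under stated assumptions, not the authors' algorithm: the function names, the normalisation of weights to sum to one, and the final step (thresholding the matrix at 0.5 and taking connected components) are all hypothetical choices made here. The Davies-Bouldin-based utility mentioned for unlabeled data is not covered.

```python
from collections import Counter


def purity(labels_true, labels_pred):
    """Fraction of points that belong to the majority true class of their cluster."""
    clusters = {}
    for t, p in zip(labels_true, labels_pred):
        clusters.setdefault(p, []).append(t)
    majority = sum(Counter(m).most_common(1)[0][1] for m in clusters.values())
    return majority / len(labels_true)


def weighted_coassociation(partitions, weights):
    """co[i][j] = weighted share of partitions that put points i and j together.

    Weights are normalised to sum to 1 (an assumption of this sketch).
    """
    n = len(partitions[0])
    total = sum(weights)
    co = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += w / total
    return co


def consensus_labels(co, threshold=0.5):
    """Hypothetical final step: connected components of the graph co > threshold."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy data: six points, two true classes, three base partitions.
truth = [0, 0, 0, 1, 1, 1]
partitions = [
    [0, 0, 0, 1, 1, 1],  # agrees with the ground truth
    [0, 0, 0, 0, 1, 1],  # misplaces point 3
    [0, 0, 0, 1, 1, 0],  # misplaces point 5
]
weights = [purity(truth, p) for p in partitions]
co = weighted_coassociation(partitions, weights)
consensus = consensus_labels(co)
print(weights)
print(consensus)
```

In this toy case the two imperfect partitions make different mistakes, so no cross-class pair accumulates more than half of the total weight and the consensus recovers the two true groups. With real data, the threshold and the method used to extract clusters from the matrix would need to be chosen with care.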
Pages: 15
Related papers
50 in total
  • [1] Weighted consensus clustering and its application to Big data
    Alguliyev, Rasim M.
    Aliguliyev, Ramiz M.
    Sukhostat, Lyudmila V.
    Expert Systems with Applications, 2021, 150
  • [2] Fuzzy Consensus Clustering With Applications on Big Data
    Wu, Junjie
    Wu, Zhiang
    Cao, Jie
    Liu, Hongfu
    Chen, Guoqing
    Zhang, Yanchun
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2017, 25 (06) : 1430 - 1445
  • [3] Sampling-Based Consensus Fuzzy Clustering on Big Data
    Zoghlami, Mohamed Ali
    Sassi Hidri, Minyar
    Ben Ayed, Rahma
    2016 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2016, : 1501 - 1508
  • [4] A hierarchical fuzzy cluster ensemble approach and its application to big data clustering
    Su, Pan
    Shang, Changjing
    Shen, Qiang
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2015, 28 (06) : 2409 - 2421
  • [5] Scalable incremental fuzzy consensus clustering algorithm for handling big data
    Jha, Preeti
    Tiwari, Aruna
    Bharill, Neha
    Ratnaparkhe, Milind
    Nagendra, Neha
    Mounika, Mukkamalla
    SOFT COMPUTING, 2021, 25 (13) : 8703 - 8719
  • [6] Big Data Clustering: A Review
    Shirkhorshidi, Ali Seyed
    Aghabozorgi, Saeed
    Teh, Ying Wah
    Herawan, Tutut
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2014, PT V, 2014, 8583 : 707 - 720
  • [7] MapReduce Clustering for Big Data
    Ghattas, Badih
    Pinto, Antoine
    Diao, Sambou
    2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 5116 - 5124
  • [8] Scalable incremental fuzzy consensus clustering algorithm for handling big data
    Preeti Jha
    Aruna Tiwari
    Neha Bharill
    Milind Ratnaparkhe
    Neha Nagendra
    Mukkamalla Mounika
    Soft Computing, 2021, 25 : 8703 - 8719
  • [9] Clustering Application for Streaming Big Data in Smart Grid
    Banga, Alisha
    Sinha, Amrita
    PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2018, : 1051 - 1054
  • [10] Weighted z-Distance-Based Clustering and Its Application to Time-Series Data
    Wang, Zhao-Yu
    Wu, Chen-Yu
    Lin, Yan-Ting
    Lee, Shie-Jue
    APPLIED SCIENCES-BASEL, 2019, 9 (24):