Weighted consensus clustering and its application to Big data

Cited by: 17
Authors
Alguliyev, Rasim M. [1]
Aliguliyev, Ramiz M. [1]
Sukhostat, Lyudmila V. [1]
Affiliations
[1] Azerbaijan Natl Acad Sci, Inst Informat Technol, 9A B Vahabzade St, AZ-1141 Baku, Azerbaijan
Keywords
Weighted consensus clustering; Big data; Utility function; Purity-based utility function; Co-association matrix; ENSEMBLE; ALGORITHM; INDEXES
DOI
10.1016/j.eswa.2020.113294
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The aim of this study is the development of a weighted consensus clustering that assigns weights to single clustering methods using the purity utility function. In the case of Big data that does not contain labels, the utility function based on the Davies-Bouldin index is proposed in this paper. The Banknote authentication, Phishing, Diabetic, Magic04, Credit card clients, Covertype, Phone accelerometer, and NSL-KDD datasets are used to assess the efficiency of the proposed consensus approach. The proposed approach is evaluated using the Euclidean, Minkowski, squared Euclidean, cosine, and Chebychev distance metrics. It is compared with single clustering algorithms (DBSCAN, OPTICS, CLARANS, k-means, and shared nearby neighbor clustering). The experimental results show the effectiveness of the proposed approach to the Big data clustering in comparison to single clustering methods. The proposed weighted consensus clustering using the squared Euclidean distance metric achieves the highest accuracy, which is a very promising result for Big data clustering. It can be applied to expert systems to help experts make group decisions based on several alternatives. The paper also provides directions for future research on consensus clustering in this area. © 2020 Elsevier Ltd. All rights reserved.
Pages: 15
Related papers
50 in total
  • [21] Strategies for Big Data Clustering
    Kurasova, Olga
    Marcinkevicius, Virginijus
    Medvedev, Viktor
    Rapecka, Aurimas
    Stefanovic, Pavel
    2014 IEEE 26TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2014, : 740 - 747
  • [22] Big Data clustering validity
    Tlili, Monia
    Hamdani, Tarek M.
    2014 6TH INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR), 2014, : 348 - 352
  • [23] Effective feature representation using symbolic approach for classification and clustering of big data
    Lavanya, P. G.
    Kouser, K.
    Suresha, Mallappa
    EXPERT SYSTEMS WITH APPLICATIONS, 2021, 173
  • [24] Sorensen-Dice Similarity Indexing based Weighted Iterative Clustering for Big Data Analytics
    Annathurai, KalyanaSaravanan
    Angamuthu, Tamilarasi
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2022, 19 (01) : 11 - 22
  • [25] Deep Learning Model and Its Application in Big Data
    Zhou, Yuanming
    Zhao, Shifeng
    Wang, Xuesong
    Liu, Wei
    DESIGN, USER EXPERIENCE, AND USABILITY: THEORY AND PRACTICE, DUXU 2018, PT I, 2018, 10918 : 795 - 806
  • [26] OpenStack Platform and its Application in Big Data Processing
    Shao, Cen
    Liang, Bo
    Wang, Feng
    Deng, Hui
    Dai, Wei
    Wei, Shoulin
    Zhang, Xiaoli
    Yuan, Zhi
    2015 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT NETWORKS AND INTELLIGENT SYSTEMS (ICINIS), 2015, : 98 - 101
  • [27] Big data clustering with varied density based on MapReduce
    Heidari, Safanaz
    Alborzi, Mahmood
    Radfar, Reza
    Afsharkazemi, Mohammad Ali
    Ghatari, Ali Rajabzadeh
    JOURNAL OF BIG DATA, 2019, 6 (01)
  • [28] A Modified Hybrid Fuzzy Clustering Method for Big Data
    Khoshkbarchi, Amir
    Kamali, Ali
    Amjadi, Mehdi
    Haeri, Maryam Amir
    2016 8TH INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS (IST), 2016, : 196 - 201
  • [29] Parallel K-prototypes for Clustering Big Data
    Ben HajKacem, Mohamed Aymen
    Ben N'cir, Chiheb-Eddine
    Essoussi, Nadia
    COMPUTATIONAL COLLECTIVE INTELLIGENCE (ICCCI 2015), PT II, 2015, 9330 : 628 - 637
  • [30] Weighted Topological Clustering for Categorical Data
    Rogovschi, Nicoleta
    Nadif, Mohamed
    NEURAL INFORMATION PROCESSING, PT I, 2011, 7062 : 599 - +