Density-based clustering of big probabilistic graphs

Cited by: 17
Authors
Halim, Zahid [1]
Khattak, Jamal Hussain [1, 2]
Affiliations
[1] Ghulam Ishaq Khan Inst Engn Sci & Technol, Fac Comp Sci & Engn, Topi, Pakistan
[2] Allied Bank Ltd, Informat Technol Grp, Business Solut & Dev, Lahore, Pakistan
Keywords
Clustering graphs; Machine learning; Big graphs; Clustering; Community detection; Uncertain data; Algorithm
DOI
10.1007/s12530-018-9223-2
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is a machine learning task that groups similar objects into coherent sets, so that the objects within a cluster exhibit similar behavior. With the exponential increase in data volume, robust approaches are required to process data and extract clusters. In addition to large volumes, datasets may contain uncertainties due to the heterogeneity of the data sources, both of which are characteristic of Big Data. Modern machine learning approaches and algorithms widely use probability theory to model data uncertainty. Such huge uncertain data can be transformed into a probabilistic graph-based representation. This work presents an approach for density-based clustering of big probabilistic graphs. The proposed approach clusters large probabilistic graphs using the graph's density, where the clustering process is guided by the nodes' degree and neighborhood information. The approach is evaluated on seven real-world benchmark datasets, namely protein-to-protein interaction, yahoo, movie-lens, core, last.fm, delicious social bookmarking system, and epinions. These datasets are first transformed into a graph-based representation before the proposed clustering algorithm is applied. The obtained results are evaluated using three cluster validation indices, namely the Davies-Bouldin index, the Dunn index, and the Silhouette coefficient. The proposal is also compared with four state-of-the-art approaches for clustering large probabilistic graphs. The results obtained on the seven datasets with the three cluster validity indices suggest better performance of the proposed approach.
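To make the idea of degree- and neighborhood-guided density clustering concrete, the following is a minimal illustrative sketch and not the authors' algorithm, which the abstract does not specify. It assumes an undirected probabilistic graph stored as a dictionary of edge-existence probabilities, treats a node as a "core" node when its expected degree (the sum of its incident edge probabilities) reaches a threshold, and grows clusters DBSCAN-style across sufficiently probable edges. The names `build_graph`, `expected_degree`, `density_cluster`, and both thresholds are hypothetical choices made for illustration.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected probabilistic graph: node -> {neighbor: edge probability}."""
    graph = {}
    for u, v, p in edges:
        graph.setdefault(u, {})[v] = p
        graph.setdefault(v, {})[u] = p
    return graph


def expected_degree(graph, node):
    """Expected degree of a node: the sum of its incident edge probabilities."""
    return sum(graph[node].values())


def density_cluster(graph, min_expected_degree=1.5, min_edge_prob=0.5):
    """DBSCAN-style sketch (assumed, not taken from the paper).

    Core nodes have expected degree >= min_expected_degree. Clusters grow
    from core nodes along edges with probability >= min_edge_prob.
    Nodes that never receive a label are treated as noise.
    """
    core = {n for n in graph if expected_degree(graph, n) >= min_expected_degree}
    labels = {}
    cluster_id = 0
    for seed in sorted(core):
        if seed in labels:
            continue
        labels[seed] = cluster_id
        queue = deque([seed])
        while queue:
            u = queue.popleft()
            if u not in core:
                continue  # border nodes join a cluster but do not expand it
            for v, p in graph[u].items():
                if p >= min_edge_prob and v not in labels:
                    labels[v] = cluster_id
                    queue.append(v)
        cluster_id += 1
    return labels


# Toy example: two dense triangles joined by a weak edge, plus a weakly attached node.
edges = [
    ("a", "b", 0.9), ("b", "c", 0.8), ("a", "c", 0.7),
    ("c", "d", 0.1),
    ("d", "e", 0.9), ("e", "f", 0.85), ("d", "f", 0.8),
    ("g", "a", 0.2),
]
labels = density_cluster(build_graph(edges))
print(labels)
```

In this toy graph the two triangles end up in separate clusters because the edge joining them has low probability, and node `g` is left unlabeled as noise. A clustering produced this way could then be scored with validity indices such as those named in the abstract.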
Pages: 333-350
Page count: 18
Related papers
50 in total
  • [21] Fast density estimation for density-based clustering methods
    Cheng, Difei
    Xu, Ruihang
    Zhang, Bo
    Jin, Ruinan
    NEUROCOMPUTING, 2023, 532 : 170 - 182
  • [22] Clustering probabilistic graphs using neighbourhood paths
    Hussain, Syed Fawad
    Maab, Iffat
    INFORMATION SCIENCES, 2021, 568 : 216 - 238
  • [23] DBHD: Density-based clustering for highly varying density
    Durani, Walid
    Mautz, Dominik
    Plant, Claudia
    Boehm, Christian
    2022 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2022: 921 - 926
  • [24] A Density-based Clustering Approach for Monitoring of Injection Moulding Machine
    Theljani, Foued
    Belkadi, Adel
    Billaudel, Patrice
    INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2021, 19 (07) : 2583 - 2595
  • [25] Robustness of density-based clustering methods with various neighborhood relations
    Nasibov, Efendi N.
    Ulutagay, Goezde
    FUZZY SETS AND SYSTEMS, 2009, 160 (24) : 3601 - 3615
  • [26] A Clustering Density-Based Sample Reduction Method
    Mohammadi, Mahdi
    Raahemi, Bijan
    Akbari, Ahmad
    ADVANCES IN ARTIFICIAL INTELLIGENCE, CANADIAN AI 2014, 2014, 8436 : 319 - 325
  • [27] Anytime density-based clustering of complex data
    Mai, Son T.
    He, Xiao
    Feng, Jing
    Plant, Claudia
    Boehm, Christian
    KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 45 (02) : 319 - 355
  • [28] PARDICLE: Parallel Approximate Density-based Clustering
    Patwary, Md. Mostofa Ali
    Satish, Nadathur
    Sundaram, Narayanan
    Manne, Fredrik
    Habib, Salman
    Dubey, Pradeep
    SC14: INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2014: 560 - 571
  • [29] Mining Stable Communities in Temporal Networks by Density-Based Clustering
    Qin, Hongchao
    Li, Rong-Hua
    Wang, Guoren
    Huang, Xin
    Yuan, Ye
    Yu, Jeffrey Xu
    IEEE TRANSACTIONS ON BIG DATA, 2022, 8 (03) : 671 - 684
  • [30] Privacy-preserving Density-based Clustering
    Bozdemir, Beyza
    Canard, Sebastien
    Ermis, Orhan
    Moellering, Helen
    Onen, Melek
    Schneider, Thomas
    ASIA CCS'21: PROCEEDINGS OF THE 2021 ACM ASIA CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2021: 658 - 671