Density-based clustering of big probabilistic graphs

Cited by: 17
Authors
Halim, Zahid [1]
Khattak, Jamal Hussain [1, 2]
Affiliations
[1] Ghulam Ishaq Khan Inst Engn Sci & Technol, Fac Comp Sci & Engn, Topi, Pakistan
[2] Allied Bank Ltd, Informat Technol Grp, Business Solut & Dev, Lahore, Pakistan
Keywords
Clustering graphs; Machine learning; Big graphs; Clustering; Community detection; Uncertain data; Algorithm
DOI
10.1007/s12530-018-9223-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering is a machine learning task that groups similar objects into coherent sets. The objects in each group exhibit similar behavior within their cluster. With the exponential increase in data volume, robust approaches are required to process the data and extract clusters. In addition to their large volume, datasets may contain uncertainties due to the heterogeneity of the data sources, resulting in Big Data. Modern machine learning approaches and algorithms widely use probability theory to determine data uncertainty. Such huge uncertain data can be transformed into a probabilistic graph-based representation. This work presents an approach for density-based clustering of big probabilistic graphs. The proposed approach clusters large probabilistic graphs using the graph's density, where the clustering process is guided by the nodes' degree and the neighborhood information. The proposed approach is evaluated using seven real-world benchmark datasets, namely protein-to-protein interaction, Yahoo, MovieLens, core, Last.fm, the Delicious social bookmarking system, and Epinions. These datasets are first transformed into a graph-based representation before the proposed clustering algorithm is applied. The obtained results are evaluated using three cluster validation indices, namely the Davies-Bouldin index, the Dunn index, and the Silhouette coefficient. The proposal is also compared with four state-of-the-art approaches for clustering large probabilistic graphs. The results obtained on the seven datasets with the three cluster validity indices suggest better performance of the proposed approach.
Pages: 333-350
Page count: 18
Related papers
50 in total
  • [1] Density-based clustering of big probabilistic graphs
    Halim, Zahid
    Khattak, Jamal Hussain
    Evolving Systems, 2019, 10 : 333 - 350
  • [2] Density-based Probabilistic Clustering of Uncertain Moving Objects
    Xu, Huajie
    Hu, Xiaoming
    Yang, Bing
    Xu, Juan
    2009 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND INTELLIGENT SYSTEMS, PROCEEDINGS, VOL 1, 2009, : 847 - +
  • [3] Density-based clustering
    Kriegel, Hans-Peter
    Kroeger, Peer
    Sander, Joerg
    Zimek, Arthur
    WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY, 2011, 1 (03) : 231 - 240
  • [4] Novel density-based and hierarchical density-based clustering algorithms for uncertain data
    Zhang, Xianchao
    Liu, Han
    Zhang, Xiaotong
    NEURAL NETWORKS, 2017, 93 : 240 - 255
  • [5] Stability of Density-Based Clustering
    Rinaldo, Alessandro
    Singh, Aarti
    Nugent, Rebecca
    Wasserman, Larry
    JOURNAL OF MACHINE LEARNING RESEARCH, 2012, 13 : 905 - 948
  • [6] Clustering with Missing Features: A Density-Based Approach
    Gao, Kun
    Khan, Hassan Ali
    Qu, Wenwen
    SYMMETRY-BASEL, 2022, 14 (01):
  • [7] Parallel Image Scaling Density-based Clustering
    Bi, Wenhao
    Zhang, An
    Gao, Fei
    2020 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2020, : 2084 - 2091
  • [8] An Attempt at Improving Density-based Clustering Algorithms
    Brown, Daniel
    Japa, Arialdis
    Shi, Yong
    PROCEEDINGS OF THE 2019 ANNUAL ACM SOUTHEAST CONFERENCE (ACMSE 2019), 2019, : 172 - 175
  • [9] CoExDBSCAN: Density-based Clustering with Constrained Expansion
    Ertl, Benjamin
    Meyer, Joerg
    Schneider, Matthias
    Streit, Achim
    PROCEEDINGS OF THE 12TH INTERNATIONAL JOINT CONFERENCE ON KNOWLEDGE DISCOVERY, KNOWLEDGE ENGINEERING AND KNOWLEDGE MANAGEMENT (KDIR), VOL 1, 2020, : 104 - 115
  • [10] Cludoop: An Efficient Distributed Density-Based Clustering for Big Data Using Hadoop
    Yu, Yanwei
    Zhao, Jindong
    Wang, Xiaodong
    Wang, Qin
    Zhang, Yonggang
    INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS, 2015,