Density-based clustering of big probabilistic graphs

Cited by: 17
Authors
Halim, Zahid [1]
Khattak, Jamal Hussain [1, 2]
Affiliations
[1] Ghulam Ishaq Khan Inst Engn Sci & Technol, Fac Comp Sci & Engn, Topi, Pakistan
[2] Allied Bank Ltd, Informat Technol Grp, Business Solut & Dev, Lahore, Pakistan
Keywords
Clustering graphs; Machine learning; Big graphs; Clustering; Community detection; UNCERTAIN DATA; ALGORITHM;
DOI
10.1007/s12530-018-9223-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is a machine learning task that groups similar objects into coherent sets. Objects within a cluster exhibit similar behavior. With the exponential increase in data volume, robust approaches are required to process the data and extract clusters. In addition to large volumes, datasets may contain uncertainties due to the heterogeneity of the data sources, which is characteristic of Big Data. Modern machine learning approaches and algorithms widely use probability theory to model data uncertainty. Such large, uncertain data can be transformed into a probabilistic graph-based representation. This work presents an approach for density-based clustering of big probabilistic graphs. The proposed approach clusters large probabilistic graphs using the graph's density, where the clustering process is guided by the nodes' degree and neighborhood information. The approach is evaluated on seven real-world benchmark datasets, namely protein-to-protein interaction, yahoo, movie-lens, core, last.fm, delicious social bookmarking system, and epinions. These datasets are first transformed into a graph-based representation before the proposed clustering algorithm is applied. The results are evaluated using three cluster validation indices, namely the Davies-Bouldin index, the Dunn index, and the Silhouette coefficient. The proposal is also compared with four state-of-the-art approaches for clustering large probabilistic graphs. The results obtained on the seven datasets with the three cluster validity indices suggest better performance of the proposed approach.
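To make the idea of degree- and neighborhood-guided clustering on a probabilistic graph concrete, here is a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes a DBSCAN-style scheme in which each edge carries an existence probability, a node's expected degree is the sum of its incident edge probabilities, and clusters grow from high-degree "core" nodes along sufficiently probable edges. The function names, the thresholds `min_prob` and `min_expected_degree`, and the toy edge list are all hypothetical.

```python
from collections import defaultdict, deque

# Toy probabilistic graph (hypothetical data): (node_u, node_v, edge_probability).
EXAMPLE_EDGES = [
    ("a", "b", 0.9), ("b", "c", 0.8), ("a", "c", 0.7),
    ("d", "e", 0.9), ("e", "f", 0.85), ("d", "f", 0.8),
    ("c", "d", 0.1),  # weak bridge between the two dense groups
    ("g", "a", 0.2),  # weakly attached node
]


def expected_degrees(edges):
    """Expected degree of each node: the sum of its incident edge probabilities."""
    degree = defaultdict(float)
    for u, v, p in edges:
        degree[u] += p
        degree[v] += p
    return dict(degree)


def density_cluster(edges, min_prob=0.5, min_expected_degree=1.5):
    """DBSCAN-style sketch: grow clusters from core nodes over probable edges.

    Assumptions (not taken from the paper): an edge counts as a neighborhood
    link when its probability is at least min_prob, and a node is a core node
    when its expected degree is at least min_expected_degree.
    Returns a dict mapping node -> cluster label, with -1 meaning unclustered.
    """
    neighbors = defaultdict(set)
    for u, v, p in edges:
        if p >= min_prob:
            neighbors[u].add(v)
            neighbors[v].add(u)

    degree = expected_degrees(edges)
    core = {node for node, d in degree.items() if d >= min_expected_degree}

    labels = {}
    next_label = 0
    for seed in sorted(core):  # sorted for deterministic labels
        if seed in labels:
            continue
        labels[seed] = next_label
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            for nb in neighbors[node]:
                if nb in labels:
                    continue
                labels[nb] = next_label
                if nb in core:  # only core nodes keep expanding the cluster
                    queue.append(nb)
        next_label += 1

    # Nodes never reached from a core node are marked as unclustered.
    for node in degree:
        labels.setdefault(node, -1)
    return labels


if __name__ == "__main__":
    print(density_cluster(EXAMPLE_EDGES))
```

On the toy graph, the low-probability bridge between `c` and `d` is ignored when building neighborhoods, so the two triangles form separate clusters and the weakly attached node `g` is left unclustered. The resulting labels could then be scored with a validity index such as the Silhouette coefficient once a node-to-node distance is chosen.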
Pages: 333-350
Number of pages: 18