Consensus Big Data Clustering for Bayesian Mixture Models

Cited by: 3
Authors
Karras, Christos [1]
Karras, Aristeidis [1]
Giotopoulos, Konstantinos C. [2]
Avlonitis, Markos [3]
Sioutas, Spyros [1]
Affiliations
[1] Univ Patras, Comp Engn & Informat Dept, Patras 26504, Greece
[2] Univ Patras, Dept Management Sci & Technol, Patras 26334, Greece
[3] Ionian Univ, Dept Informat, Kerkira 49100, Greece
Keywords
stochastic data engineering; cluster analysis; Bayesian mixture modelling; consensus clustering; big-data management and analytics; NUMBER;
DOI
10.3390/a16050245
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In the context of big-data analysis, the clustering technique holds significant importance for the effective categorization and organization of extensive datasets. However, pinpointing the ideal number of clusters and handling high-dimensional data can be challenging. To tackle these issues, several strategies have been suggested, such as a consensus clustering ensemble that yields more significant outcomes compared to individual models. Another valuable technique for cluster analysis is Bayesian mixture modelling, which is known for its adaptability in determining cluster numbers. Traditional inference methods such as Markov chain Monte Carlo may be computationally demanding and limit the exploration of the posterior distribution. In this work, we introduce an innovative approach that combines consensus clustering and Bayesian mixture models to improve big-data management and simplify the process of identifying the optimal number of clusters in diverse real-world scenarios. By addressing the aforementioned hurdles and boosting accuracy and efficiency, our method considerably enhances cluster analysis. This fusion of techniques offers a powerful tool for managing and examining large and intricate datasets, with possible applications across various industries.
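The consensus step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the ensemble members (for example, many short runs of a Bayesian mixture model sampler) have already produced one label vector each; the function names `consensus_matrix` and `consensus_clusters`, the toy partitions, and the thresholded connected-components step used to read off a final partition are all hypothetical choices made for illustration, not the method described in the paper.

```python
from itertools import combinations


def consensus_matrix(partitions):
    """Return the fraction of partitions in which each pair of items co-clusters."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        if len(labels) != n:
            raise ValueError("all partitions must label the same number of items")
        for i in range(n):
            counts[i][i] += 1  # an item always co-clusters with itself
        for i, j in combinations(range(n), 2):
            # Only equality of labels matters, so label switching is harmless.
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    width = len(partitions)
    return [[c / width for c in row] for row in counts]


def consensus_clusters(matrix, threshold=0.5):
    """Assumed summary step: connected components of pairs above the threshold."""
    n = len(matrix)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy ensemble: four label vectors over five items (made-up data).
partitions = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [2, 2, 0, 0, 1],
]
matrix = consensus_matrix(partitions)
final_labels = consensus_clusters(matrix, threshold=0.5)
```

Because the matrix is built from pairwise agreement only, the arbitrary label names used by each ensemble member do not need to be aligned. The number of clusters then falls out of the summary step rather than being fixed in advance.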
Pages: 18
Related papers
50 in total
  • [1] Consensus clustering for Bayesian mixture models
    Coleman, Stephen
    Kirk, Paul D. W.
    Wallace, Chris
    BMC BIOINFORMATICS, 2022, 23 (01)
  • [3] Consensus Clustering on Big Data
    Liu, Hongfu
    Cheng, Gong
    Wu, Junjie
    2015 12TH INTERNATIONAL CONFERENCE ON SERVICE SYSTEMS AND SERVICE MANAGEMENT (ICSSSM), 2015,
  • [4] Bayesian Mixture Models with Focused Clustering for Mixed Ordinal and Nominal Data
    DeYoreo, Maria
    Reiter, Jerome P.
    Hillygus, D. Sunshine
    BAYESIAN ANALYSIS, 2017, 12 (03): : 679 - 703
  • [6] Bayesian curve fitting and clustering with Dirichlet process mixture models for microarray data
    Park, Ju-Hyun
    Kyung, Minjung
    JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2019, 48 (02) : 207 - 220
  • [7] Bayesian consensus clustering for multivariate longitudinal data
    Lu, Zihang
    Lou, Wendy
    STATISTICS IN MEDICINE, 2022, 41 (01) : 108 - 127
  • [8] Fuzzy Consensus Clustering With Applications on Big Data
    Wu, Junjie
    Wu, Zhiang
    Cao, Jie
    Liu, Hongfu
    Chen, Guoqing
    Zhang, Yanchun
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2017, 25 (06) : 1430 - 1445
  • [9] A Bayesian mixture model for clustering circular data
    Rodriguez, Carlos E.
    Nunez-Antonio, Gabriel
    Escarela, Gabriel
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 143
  • [10] Bayesian Mixture of AR Models for Time Series Clustering
    Venkataramana, Kini B.
    Sekhar, C. Chandra
    ICAPR 2009: SEVENTH INTERNATIONAL CONFERENCE ON ADVANCES IN PATTERN RECOGNITION, PROCEEDINGS, 2009, : 35 - 38