Enhanced Bayesian Gaussian hidden Markov mixture clustering for improved knowledge discovery

Cited by: 0
Authors
Ganesan, Anusha [1]
Paul, Anand [2]
Kim, Sungho [1]
Affiliations
[1] Yeungnam Univ, Dept Elect Engn, Gyongsan 38541, South Korea
[2] Louisiana State Univ, Dept Biostat & Data Sci, Hlth Sci Ctr, New Orleans, LA 70112 USA
Funding
National Research Foundation, Singapore
Keywords
Baum-Welch algorithm; Bayesian; Bayesian Gaussian mixture model; Clustering; Cross-validation; Distance metric; Gaussian; Hidden Markov model; Mixture models; Viterbi algorithm; MODELS; HMM;
DOI
10.1007/s10044-024-01374-w
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The hidden Markov model (HMM) is widely utilized in natural language processing, speech recognition, autonomous vehicular systems, and healthcare for tasks such as clustering, pattern recognition, predictive modeling, anomaly detection, and time-series forecasting. However, HMMs can be sensitive to initial states, compromising clustering reliability. To address this issue, we propose an innovative integration of an HMM with hybrid distance metric learning and a modified Bayesian Gaussian mixture model (BGMM) to enhance clustering performance and robustness. A significant challenge in HMM applications is determining the optimal number of hidden states. We address this using a k-fold cross-validation strategy. Implementing our Bayesian Gaussian Hidden Markov Mixture Clustering Model (BGH2MCM) on five diverse datasets, we categorize the observed data sequences according to underlying hidden state sequences. This approach yields superior outcomes to conventional techniques such as K-means, agglomerative clustering, density-based spatial clustering of applications with noise (DBSCAN), and the BGMM. We evaluate the efficiency of our model using silhouette, Davies-Bouldin, and Calinski-Harabasz scores, accuracy metrics, and computation time. Our results demonstrate that the BGH2MCM consistently achieves better clustering quality and computational efficiency, showing an average computation time 23% lower than agglomerative clustering with HMM, 22% less than DBSCAN with HMM, and 14% lower than K-means with the HMM and a BGMM-HMM across all datasets. This study highlights the potential of our BGH2MCM to improve data mining and knowledge discovery practices from complex, real-world datasets.
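The three internal validity indices named in the abstract (silhouette, Davies-Bouldin, Calinski-Harabasz) can be illustrated with a minimal standard-library sketch. It follows the usual textbook definitions with Euclidean distance; it is not the authors' implementation, and the function names and toy data are hypothetical, chosen only for illustration. It assumes at least two clusters and non-degenerate data (no coinciding centroids, non-zero within-cluster scatter).

```python
from math import dist
from statistics import fmean


def _groups(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return groups


def _centroid(cluster):
    """Coordinate-wise mean of a list of points."""
    return tuple(fmean(coord) for coord in zip(*cluster))


def silhouette(points, labels):
    """Mean silhouette coefficient; ranges over [-1, 1], higher is better."""
    groups = _groups(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = groups[lab]
        if len(own) == 1:  # singleton clusters score 0 by convention
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(fmean([dist(p, q) for q in other])
                for other_lab, other in groups.items() if other_lab != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return fmean(scores)


def davies_bouldin(points, labels):
    """Davies-Bouldin index; non-negative, lower is better."""
    groups = _groups(points, labels)
    cents = {lab: _centroid(c) for lab, c in groups.items()}
    # scatter: mean distance of a cluster's points to its centroid
    scatter = {lab: fmean([dist(p, cents[lab]) for p in c])
               for lab, c in groups.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                 for j in groups if j != i)
             for i in groups]
    return fmean(worst)


def calinski_harabasz(points, labels):
    """Calinski-Harabasz score; non-negative, higher is better."""
    groups = _groups(points, labels)
    n, k = len(points), len(groups)
    overall = _centroid(points)
    cents = {lab: _centroid(c) for lab, c in groups.items()}
    between = sum(len(c) * dist(cents[lab], overall) ** 2
                  for lab, c in groups.items())
    within = sum(dist(p, cents[lab]) ** 2
                 for lab, c in groups.items() for p in c)
    return (between / (k - 1)) / (within / (n - k))


# Toy data (hypothetical): two well-separated square blobs in the plane.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
good = [0, 0, 0, 0, 1, 1, 1, 1]   # labels that match the blobs
bad = [0, 1, 0, 1, 0, 1, 0, 1]    # labels that cut across both blobs

for name, labels in (("good", good), ("bad", bad)):
    print(name,
          round(silhouette(points, labels), 3),
          round(davies_bouldin(points, labels), 3),
          round(calinski_harabasz(points, labels), 3))
```

On this toy data the labelling that matches the blobs scores higher on silhouette and Calinski-Harabasz and lower on Davies-Bouldin than the labelling that cuts across them, which is the direction in which each index is read when comparing clustering methods.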
Pages: 16