Orthogonal Mixture of Hidden Markov Models

Cited by: 0
Authors
Safinianaini, Negar [1 ]
de Souza, Camila P. E. [2 ]
Bostrom, Henrik [1 ]
Lagergren, Jens [1 ,3 ]
Affiliations
[1] KTH Royal Inst Technol, Sch Elect Engn & Comp Sci, Stockholm, Sweden
[2] Univ Western Ontario, Dept Stat & Actuarial Sci, London, ON, Canada
[3] Sci Life Lab, Solna, Sweden
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2020, PT I | 2021, Vol. 12457
Keywords
Hidden Markov models; Mixture models; Mixture of hidden Markov models; Expectation maximization; Orthogonality; Regularization; Penalty; Inference
DOI
10.1007/978-3-030-67658-2_29
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Mixtures of Hidden Markov Models (MHMM) are widely used for clustering of sequential data, by letting each cluster correspond to a Hidden Markov Model (HMM). Expectation Maximization (EM) is the standard approach for learning the parameters of an MHMM. However, due to the non-convexity of the objective function, EM can converge to poor local optima. To tackle this problem, we propose a novel method, the Orthogonal Mixture of Hidden Markov Models (oMHMM), which aims to direct the search away from candidate solutions that include very similar HMMs, since those do not fully exploit the power of the mixture model. The directed search is achieved by including a penalty in the objective function that favors higher orthogonality between the transition matrices of the HMMs. Experimental results on both simulated and real-world datasets show that the oMHMM consistently finds equally good or better local optima than the standard EM for an MHMM; for some datasets, the clustering performance is significantly improved by our novel oMHMM (up to 55 percentage points w.r.t. the v-measure). Moreover, the oMHMM may also decrease the computational cost substantially, reducing the number of iterations down to a fifth of those required by MHMM using standard EM.
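The orthogonality penalty described above can be illustrated with a minimal sketch. This is not the paper's exact formulation; the function name and the use of the pairwise Frobenius inner product trace(A_i^T A_j) as the (non-)orthogonality measure are illustrative assumptions. The idea is that a mixture whose HMMs have very similar transition matrices incurs a larger penalty than one whose transition matrices differ.

```python
import numpy as np

def orthogonality_penalty(transition_matrices):
    """Illustrative penalty: sum of pairwise Frobenius inner products
    <A_i, A_j> = trace(A_i^T A_j) over all HMM transition matrices.
    The sum is zero when the matrices are mutually orthogonal in the
    Frobenius sense and grows as they become more similar, so adding
    it to the EM objective discourages near-duplicate HMMs."""
    penalty = 0.0
    K = len(transition_matrices)
    for i in range(K):
        for j in range(i + 1, K):
            # Elementwise product summed = trace(A_i^T A_j).
            penalty += np.sum(transition_matrices[i] * transition_matrices[j])
    return penalty

# Two near-identical transition matrices versus two dissimilar ones:
A = np.array([[0.9, 0.1],
              [0.1, 0.9]])
B = np.array([[0.1, 0.9],
              [0.9, 0.1]])
similar = orthogonality_penalty([A, A])
dissimilar = orthogonality_penalty([A, B])
print(similar, dissimilar)  # the similar pair incurs the larger penalty
```

In a penalized EM, a term of this form (weighted by a regularization coefficient) would be subtracted from the expected complete-data log-likelihood during the M-step, steering the search away from mixtures of near-identical HMMs.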
Pages: 509-525
Page count: 17