Orthogonal Mixture of Hidden Markov Models

Cited by: 0
Authors
Safinianaini, Negar [1 ]
de Souza, Camila P. E. [2 ]
Bostrom, Henrik [1 ]
Lagergren, Jens [1 ,3 ]
Affiliations
[1] KTH Royal Inst Technol, Sch Elect Engn & Comp Sci, Stockholm, Sweden
[2] Univ Western Ontario, Dept Stat & Actuarial Sci, London, ON, Canada
[3] Sci Life Lab, Solna, Sweden
Source
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2020, PT I | 2021, Vol. 12457
Keywords
Hidden Markov models; Mixture models; Mixture of hidden Markov models; Expectation maximization; Orthogonality; Regularization; Penalty; Inference
DOI
10.1007/978-3-030-67658-2_29
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Mixtures of Hidden Markov Models (MHMM) are widely used for clustering of sequential data, by letting each cluster correspond to a Hidden Markov Model (HMM). Expectation Maximization (EM) is the standard approach for learning the parameters of an MHMM. However, due to the non-convexity of the objective function, EM can converge to poor local optima. To tackle this problem, we propose a novel method, the Orthogonal Mixture of Hidden Markov Models (oMHMM), which aims to direct the search away from candidate solutions that include very similar HMMs, since those do not fully exploit the power of the mixture model. The directed search is achieved by including a penalty in the objective function that favors higher orthogonality between the transition matrices of the HMMs. Experimental results on both simulated and real-world datasets show that the oMHMM consistently finds equally good or better local optima than the standard EM for an MHMM; for some datasets, the clustering performance is significantly improved by our novel oMHMM (up to 55 percentage points w.r.t. the v-measure). Moreover, the oMHMM may also decrease the computational cost substantially, reducing the number of iterations down to a fifth of those required by MHMM using standard EM.
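The orthogonality penalty described above can be illustrated with a minimal sketch. This is not the paper's exact formulation; the function name and the use of the pairwise Frobenius inner product trace(A_i^T A_j) as the (non-)orthogonality measure are illustrative assumptions. The idea is that a mixture whose HMMs have very similar transition matrices incurs a larger penalty than one whose transition matrices differ.

```python
import numpy as np

def orthogonality_penalty(transition_matrices):
    """Illustrative penalty: sum of pairwise Frobenius inner products
    <A_i, A_j> = trace(A_i^T A_j) over all HMM transition matrices.
    The sum is zero when the matrices are mutually orthogonal in the
    Frobenius sense and grows as they become more similar, so adding
    it to the EM objective discourages near-duplicate HMMs."""
    penalty = 0.0
    K = len(transition_matrices)
    for i in range(K):
        for j in range(i + 1, K):
            # Elementwise product summed = trace(A_i^T A_j).
            penalty += np.sum(transition_matrices[i] * transition_matrices[j])
    return penalty

# Two near-identical transition matrices versus two dissimilar ones:
A = np.array([[0.9, 0.1],
              [0.1, 0.9]])
B = np.array([[0.1, 0.9],
              [0.9, 0.1]])
similar = orthogonality_penalty([A, A])
dissimilar = orthogonality_penalty([A, B])
print(similar, dissimilar)  # the similar pair incurs the larger penalty
```

In a penalized EM, a term of this form (weighted by a regularization coefficient) would be subtracted from the expected complete-data log-likelihood during the M-step, steering the search away from mixtures of near-identical HMMs.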
Pages: 509-525
Page count: 17