Inference in finite state space non parametric Hidden Markov Models and applications

Cited by: 0
Authors
E. Gassiat
A. Cleynen
S. Robin
Affiliations
[1] Université Paris-Sud, Laboratoire de Mathématique
[2] CNRS, Laboratoire de Mathématique
[3] AgroParisTech
[4] MIA 518
[5] INRA
[6] MIA 518
Source
Statistics and Computing | 2016 / Vol. 26
Keywords
Identifiability; Hidden Markov Models; Non-parametric
DOI
Not available
Abstract
Hidden Markov models (HMMs) are widely used in various fields to model and classify data observed along a line (e.g. time). The fit of such models relies strongly on the choice of emission distributions, which are most often taken from a parametric family. In this paper, we prove that finite state space non parametric HMMs are identifiable as soon as the transition matrix of the latent Markov chain has full rank and the emission probability distributions are linearly independent. This general result allows the use of semi- or non-parametric emission distributions. Based on this result, we present a series of classification problems that can be tackled outside the strict parametric framework. We derive the corresponding inference algorithms. We also illustrate their use on a few biological examples, showing that they may improve classification performance.
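The two conditions named in the abstract (a full-rank transition matrix and linearly independent emission distributions) can be illustrated numerically. The following is only a minimal sketch and not the authors' code: the function name `check_identifiability_conditions`, the example matrices, and the idea of representing each emission distribution by its probabilities on a finite grid of bins are all assumptions made for illustration. Since binning is a linear operation, linear independence of the binned rows implies linear independence of the underlying distributions, but a failed check on a coarse grid does not prove dependence.

```python
import numpy as np

def check_identifiability_conditions(Q, emissions):
    """Check the two conditions on a discretized model (illustrative only).

    Q         : (K, K) transition matrix of the hidden chain.
    emissions : (K, M) array; row k holds the probabilities that state k
                assigns to M bins (a hypothetical discretization).
    Returns (transition_full_rank, emissions_independent).
    """
    Q = np.asarray(Q, dtype=float)
    emissions = np.asarray(emissions, dtype=float)
    K = Q.shape[0]
    # Condition 1: the transition matrix has full rank K.
    transition_full_rank = np.linalg.matrix_rank(Q) == K
    # Condition 2: the K emission rows are linearly independent.
    emissions_independent = np.linalg.matrix_rank(emissions) == K
    return bool(transition_full_rank), bool(emissions_independent)

# Hypothetical two-state example with emissions binned into three cells.
Q_example = [[0.9, 0.1],
             [0.2, 0.8]]
B_example = [[0.5, 0.3, 0.2],
             [0.1, 0.3, 0.6]]
both_hold = check_identifiability_conditions(Q_example, B_example)

# A chain with identical rows has rank 1 (observations are then i.i.d.
# draws from a mixture), so the first condition fails.
Q_degenerate = [[0.5, 0.5],
                [0.5, 0.5]]
first_fails = check_identifiability_conditions(Q_degenerate, B_example)
```

In this sketch `both_hold` reports that both conditions are met for the example matrices, while `first_fails` flags the degenerate chain, the case in which the hidden states carry no sequential dependence.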
Pages: 61-71
Page count: 10