Segmentation uncertainty in multiple change-point models

被引:0
作者
Yann Guédon
机构
[1] UMR AGAP and Inria,CIRAD
[2] Virtual Plants,undefined
来源
Statistics and Computing | 2015年 / 25卷
关键词
Entropy; Kullback–Leibler divergence; Latent structure model; Multiple change-point detection; Smoothing algorithm;
D O I
暂无
中图分类号
学科分类号
摘要
This paper addresses the retrospective or off-line multiple change-point detection problem. In this context, there is a need of efficient diagnostic tools that enable to localize the segmentation uncertainty along the observed sequence. Concerning the segmentation uncertainty, the focus was mainly on the change-point position uncertainty. We propose to state this problem in a new way, viewing multiple change-point models as latent structure models and using results from information theory. This led us to show that the segmentation uncertainty is not reflected in the posterior distributions of the change-point position because of the marginalization that is intrinsic in the computation of these posterior distributions. The entropy of the segmentation of a given observed sequence can be considered as the canonical measure of segmentation uncertainty. This segmentation entropy can be decomposed as conditional entropy profiles that enables to localize this canonical segmentation uncertainty along the sequence. One of the main outcomes of this work is to derive efficient algorithms to compute these conditional entropy profiles. The proposed approach benefits from all the properties of the Shannon–Khinchin axioms of entropy and therefore is the unique approach for localizing the canonical segmentation uncertainty along the sequence. We introduce the Kullback–Leibler divergence of the uniform distribution from the segmentation distribution for successive numbers of change points as a new tool for assessing the number of change points selected by different methods. The proposed approach is illustrated using four contrasted examples.
引用
收藏
页码:303 / 320
页数:17
相关论文
共 32 条
  • [1] Chib S.(1998)Estimation and comparison of multiple change-point models J. Econom. 86 221-241
  • [2] Fearnhead P.(2006)Exact and efficient Bayesian inference for multiple changepoint problems Stat. Comput. 16 203-213
  • [3] Green P.J.(1995)Reversible jump Markov chain Monte Carlo computation and Bayesian model determination Biometrika 82 711-732
  • [4] Guédon Y.(2003)Estimating hidden semi-Markov chains from discrete sequences J. Comput. Graph. Stat. 12 604-639
  • [5] Guédon Y.(2007)Exploring the state sequence space for hidden Markov and semi-Markov chains Comput. Stat. Data Anal. 51 2379-2409
  • [6] Guédon Y.(2013)Exploring the latent segmentation space for the assessment of multiple change-point models Comput. Stat. 28 2641-2678
  • [7] Guédon Y.(2001)Pattern analysis in branching and axillary flowering sequences J. Theor. Biol. 212 481-520
  • [8] Barthélémy D.(2007)Analyzing growth components in trees J. Theor. Biol. 248 418-447
  • [9] Caraglio Y.(2005)Efficient computation of the hidden Markov model entropy for a given observation sequence IEEE Trans. Inf. Theory 51 2681-2685
  • [10] Costes E.(1979)A note on the intervals between coal-mining disasters Biometrika 66 191-193