Model-based clustering and segmentation of time series with changes in regime

Cited by: 57
Authors
Same, Allou [1]
Chamroukhi, Faicel [1]
Govaert, Gerard [2]
Aknin, Patrice [1]
Affiliations
[1] Univ Paris Est, IFSTTAR, GRETTIA, F-93160 Noisy Le Grand, France
[2] Univ Technol Compiegne, HEUDIASYC, CNRS, UMR 6599, F-60205 Compiegne, France
Keywords
Clustering; Time series; Change in regime; Mixture model; Regression mixture; Hidden logistic process; EM algorithm; Maximum likelihood
DOI
10.1007/s11634-011-0096-5
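Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714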
Abstract
Mixture model-based clustering, usually applied to multidimensional data, has become a popular approach in many data analysis problems, both for its good statistical properties and for the simplicity of implementing the Expectation-Maximization (EM) algorithm. Within the context of a railway application, this paper introduces a novel mixture model for time series that are subject to changes in regime. The proposed approach, called ClustSeg, models each cluster by a regression model whose polynomial coefficients vary according to a discrete hidden process. In particular, it uses logistic functions to model the (smooth or abrupt) transitions between regimes. The model parameters are estimated by maximum likelihood using an EM algorithm. The approach can also be regarded as a clustering method that finds groups of time series sharing common changes in regime; in addition to a partition of the time series, it therefore provides a segmentation of them. The optimal numbers of clusters and segments are selected by means of the Bayesian Information Criterion (BIC). The efficiency of the ClustSeg approach is demonstrated on a variety of simulated time series and on real-world time series of electrical power consumption from rail switching operations.
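As a minimal illustrative sketch (not the authors' implementation), the logistic modelling of regime transitions mentioned in the abstract can be pictured as follows, assuming the regime probabilities are a softmax of functions that are linear in time. The function names, the (intercept, slope) parameterisation and the numerical values are hypothetical; the only point is that a small slope gives a smooth transition and a large slope an almost abrupt one.

```python
import math


def regime_probabilities(t, weights):
    """Softmax (multinomial logistic) regime probabilities at time t.

    weights: list of (intercept, slope) pairs, one per regime.
    Assumption for illustration: each regime's logit is linear in time.
    """
    logits = [w0 + w1 * t for (w0, w1) in weights]
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def transition_curve(times, slope, change_point):
    """Probability of the second regime over time in a two-regime case.

    Regime 1 has logit 0 and regime 2 has logit slope * (t - change_point),
    so both regimes are equally likely at t == change_point.
    """
    weights = [(0.0, 0.0), (-slope * change_point, slope)]
    return [regime_probabilities(t, weights)[1] for t in times]


times = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
smooth = transition_curve(times, slope=5.0, change_point=0.5)    # gradual change
abrupt = transition_curve(times, slope=200.0, change_point=0.5)  # near step change
```

Here `smooth` rises gradually around t = 0.5, whereas `abrupt` jumps from nearly 0 to nearly 1 around the same point.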
Pages: 301-321
Number of pages: 21
Related papers
21 in total
  • [1] [Anonymous], 2008, EM ALGORITHM EXTENSI
  • [2] [Anonymous], 1997, FUNCTIONAL DATA ANAL
  • [3] MODEL-BASED GAUSSIAN AND NON-GAUSSIAN CLUSTERING
    BANFIELD, JD
    RAFTERY, AE
    [J]. BIOMETRICS, 1993, 49 (03) : 803 - 821
  • [4] Assessing a mixture model for clustering with the integrated completed likelihood
    Biernacki, C
    Celeux, G
    Govaert, G
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2000, 22 (07) : 719 - 725
  • [5] GAUSSIAN PARSIMONIOUS CLUSTERING MODELS
    CELEUX, G
    GOVAERT, G
    [J]. PATTERN RECOGNITION, 1995, 28 (05) : 781 - 793
  • [6] A hidden process regression model for functional data description. Application to curve discrimination
    Chamroukhi, Faicel
    Same, Allou
    Govaert, Gerard
    Aknin, Patrice
    [J]. NEUROCOMPUTING, 2010, 73 (7-9) : 1210 - 1221
  • [7] Functional clustering and identifying substructures of longitudinal data
    Chiou, Jeng-Min
    Li, Pai-Ling
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2007, 69 : 679 - 699
  • [8] Random effects mixture models for clustering electrical load series
    Coke, Geoffrey
    Tsao, Min
    [J]. JOURNAL OF TIME SERIES ANALYSIS, 2010, 31 (06) : 451 - 464
  • [9] MAXIMUM LIKELIHOOD FROM INCOMPLETE DATA VIA EM ALGORITHM
    DEMPSTER, AP
    LAIRD, NM
    RUBIN, DB
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-METHODOLOGICAL, 1977, 39 (01) : 1 - 38
  • [10] Gaffney S, 1999, P 5 ACM SIGKDD INT C