HMM-based hybrid meta-clustering ensemble for temporal data

Citations: 25
Authors
Yang, Yun [1]
Jiang, Jianmin [1, 2]
Affiliations
[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin 300072, Peoples R China
[2] Univ Surrey, Dept Comp, Guildford GU2 7XH, Surrey, England
Keywords
Temporal data clustering; Ensemble learning; Hidden Markov Model; Model selection; Time series; Model
DOI
10.1016/j.knosys.2013.12.004
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Temporal data have many distinct characteristics, including high dimensionality, complex time dependency, and large volume, all of which make clustering temporal data more challenging than clustering conventional static data. In this paper, we propose an HMM-based partitioning ensemble with hierarchical clustering refinement to solve the problems of initialization and model selection in temporal data clustering. Our approach offers four major benefits: (i) the model initialization problem is solved by applying the ensemble technique; (ii) the appropriate number of clusters is determined automatically by applying the proposed consensus function to the multiple partitions obtained from the target dataset during the clustering ensemble phase; (iii) no parameter re-estimation is required for a newly merged pair of clusters, which significantly reduces the computational cost of the final refinement process based on HMM agglomerative clustering; and (iv) the composite model is better at characterizing the complex structure of clusters. Our approach has been evaluated on synthetic data and time series benchmarks, and yields promising results for clustering tasks. (C) 2013 Elsevier B.V. All rights reserved.
Pages: 299-310
Page count: 12