Hierarchical clustering of time-series data streams

Cited by: 129
Authors
Rodrigues, Pedro Pereira [1, 2]
Gama, Joao [1, 3]
Pedroso, Joao Pedro [4, 5]
Affiliations
[1] LIAAD INESC Porto LA, P-4050190 Oporto, Portugal
[2] Univ Porto, Fac Sci, P-4050190 Oporto, Portugal
[3] Univ Porto, Fac Econ, P-4050190 Oporto, Portugal
[4] UESP INESC, P-4169007 Oporto, Portugal
[5] Univ Porto, Fac Sci, P-4169007 Oporto, Portugal
Keywords
data stream analysis; clustering streaming time series; incremental hierarchical clustering; change detection
DOI
10.1109/TKDE.2007.190727
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents and analyzes an incremental system for clustering streaming time series. The Online Divisive-Agglomerative Clustering (ODAC) system continuously maintains a tree-like hierarchy of clusters that evolves with data, using a top-down strategy. The splitting criterion is a correlation-based dissimilarity measure among time series, splitting each node by the farthest pair of streams. The system also uses a merge operator that reaggregates a previously split node in order to react to changes in the correlation structure between time series. The split and merge operators are triggered in response to changes in the diameters of existing clusters, assuming that in stationary environments, expanding the structure leads to a decrease in the diameters of the clusters. The system is designed to process thousands of data streams that flow at a high rate. The main features of the system include update time and memory consumption that do not depend on the number of examples in the stream. Moreover, the time and memory required to process an example decrease whenever the cluster structure expands. Experimental results on artificial and real data assess the processing qualities of the system, suggesting competitive performance on clustering streaming time series, and also explore its ability to deal with concept drift.
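The abstract's two central ideas, a correlation-based dissimilarity and splitting by the farthest pair of streams, can be illustrated with a minimal sketch. It is not the authors' implementation. The class name `CorrelationStats`, its methods, and the specific formula d = sqrt((1 - corr) / 2) are assumptions made for illustration, since the abstract does not state the exact measure. The sketch keeps only running sums per stream and per pair, so its memory use does not grow with the number of examples, which matches the property the abstract claims for the system.

```python
import math
from itertools import combinations


class CorrelationStats:
    """Running sufficient statistics for pairwise Pearson correlation.

    Hypothetical helper, not taken from the paper: it stores only counts,
    sums, sums of squares, and sums of cross-products.
    """

    def __init__(self, n_streams):
        self.n = 0
        self.s = [0.0] * n_streams    # sum of values per stream
        self.s2 = [0.0] * n_streams   # sum of squared values per stream
        # sum of cross-products for every unordered pair of streams
        self.sp = {pair: 0.0 for pair in combinations(range(n_streams), 2)}

    def update(self, x):
        """Absorb one example: one value per stream at the same time step."""
        self.n += 1
        for i, v in enumerate(x):
            self.s[i] += v
            self.s2[i] += v * v
        for (i, j) in self.sp:
            self.sp[(i, j)] += x[i] * x[j]

    def correlation(self, i, j):
        """Pearson correlation between streams i and j (requires i < j)."""
        n = self.n
        cov = self.sp[(i, j)] - self.s[i] * self.s[j] / n
        var_i = self.s2[i] - self.s[i] ** 2 / n
        var_j = self.s2[j] - self.s[j] ** 2 / n
        denom = math.sqrt(var_i * var_j)
        if denom == 0.0:
            return 0.0  # assumption: treat a constant stream as uncorrelated
        return cov / denom

    def dissimilarity(self, i, j):
        """Assumed measure: sqrt((1 - corr) / 2), which lies in [0, 1]."""
        r = max(-1.0, min(1.0, self.correlation(i, j)))  # guard rounding
        return math.sqrt((1.0 - r) / 2.0)

    def farthest_pair(self):
        """Pair of streams with the largest dissimilarity in this cluster."""
        return max(self.sp, key=lambda pair: self.dissimilarity(*pair))

    def diameter(self):
        """Cluster diameter: the dissimilarity of the farthest pair."""
        return self.dissimilarity(*self.farthest_pair())


# Usage: three streams, where stream 1 tracks stream 0 and stream 2 opposes it.
stats = CorrelationStats(3)
for t in range(10):
    stats.update([float(t), 2.0 * t + 1.0, -float(t)])
pivot_a, pivot_b = stats.farthest_pair()
```

In a divisive step of this kind, the two streams of the farthest pair would serve as pivots and each remaining stream would join the closer pivot. When and whether to split or merge depends on how the cluster diameters change, which the abstract mentions but this sketch does not model.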
Pages: 615-627
Page count: 13