Efficient Time Series Clustering by Minimizing Dynamic Time Warping Utilization

被引:29
作者
Cai, Borui [1 ]
Huang, Guangyan [1 ]
Samadiani, Najmeh [1 ]
Li, Guanghui [2 ]
Chi, Chi-Hung [3 ]
机构
[1] Deakin Univ, Sch Informat Technol, Burwood, Vic 3125, Australia
[2] Jiangnan Univ, Dept Comp Sci & Technol, Wuxi 214122, Jiangsu, Peoples R China
[3] Data61, Sandy Bay, Tas 7005, Australia
关键词
Time series; density-based clustering; dynamic time warping; L1-norm; SEARCH;
D O I
10.1109/ACCESS.2021.3067833
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Dynamic Time Warping (DTW) is a widely used distance measurement in time series clustering. DTW distance is invariant to time series phase perturbations but has a quadratic complexity. An effective acceleration method must reduce the DTW utilization ratio during time series clustering; for example, TADPole uses both upper and lower bounds to prune off a large ratio of expensive DTW calculations. To further reduce the DTW utilization ratio, we find that the linear-complexity L1-norm distance (Manhattan distance) is effective enough when the time series only comprise small phase perturbations. Therefore, we propose a novel time series clustering by Minimizing Dynamic Time Warping Utilization (MiniDTW) algorithm to accelerate time series clustering. In MiniDTW, the dataset is first greedily summarized into seed clusters, which comprise time series of small phase perturbations, by L1-norm distance. Then, we develop a new Sparse Symmetric Non-negative Matrix Factorization (SSNMF) algorithm, which factorizes the DTW distance matrix of seed cluster centers, to merge the seed clusters into the final clusters. The experiments on UCR time series datasets demonstrate that MiniDTW, pruning 98.52% of the DTW utilization, is better than the counterpart method, TADPole, which only prunes 75.56% of the DTW utilization; and thus MiniDTW is 10 times faster than TADPole.
引用
收藏
页码:46589 / 46599
页数:11
相关论文
共 37 条
[1]   Time-series clustering - A decade review [J].
Aghabozorgi, Saeed ;
Shirkhorshidi, Ali Seyed ;
Teh Ying Wah .
INFORMATION SYSTEMS, 2015, 53 :16-38
[3]  
[Anonymous], 2006, P 12 ACM SIGKDD INT
[4]  
[Anonymous], 2009, Finding groups in data: an introduction to cluster analysis
[5]   Accelerating Dynamic Time Warping Clustering with a Novel Admissible Pruning Strategy [J].
Begum, Nurjahan ;
Ulanova, Liudmila ;
Wang, Jun ;
Keogh, Eamonn .
KDD'15: PROCEEDINGS OF THE 21ST ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2015, :49-58
[6]   Unsupervised outlier detection for time series by entropy and dynamic time warping [J].
Benkabou, Seif-Eddine ;
Benabdeslem, Khalid ;
Canitia, Bruno .
KNOWLEDGE AND INFORMATION SYSTEMS, 2018, 54 (02) :463-486
[7]  
Berndt D J., 1994, P KDD WORKSH SEATTL, P359, DOI DOI 10.5555/3000850.3000887
[8]   Financial time series forecasting model based on CEEMDAN and LSTM [J].
Cao, Jian ;
Li, Zhi ;
Li, Jian .
PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2019, 519 :127-139
[9]  
Chen Y., 2015, The UCR time series classification archive
[10]   Fast Local Algorithms for Large Scale Nonnegative Matrix and Tensor Factorizations [J].
Cichocki, Andrzej ;
Phan, Anh-Huy .
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2009, E92A (03) :708-721