Fuzzy clustering of time series data using dynamic time warping distance

Cited by: 208
Authors
Izakian, Hesam [1]
Pedrycz, Witold [1, 2, 3]
Jamal, Iqbal [4]
Affiliations
[1] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6G 2V4, Canada
[2] King Abdulaziz Univ, Fac Engn, Dept Elect & Comp Engn, Jeddah 21589, Saudi Arabia
[3] Polish Acad Sci, Syst Res Inst, PL-00716 Warsaw, Poland
[4] AQL Management Consulting Inc, Edmonton, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Clustering time series; Dynamic Time Warping (DTW); Fuzzy clustering; Hybrid approach
DOI
10.1016/j.engappai.2014.12.015
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering is a powerful vehicle for revealing and visualizing the structure of data. When dealing with time series, it is necessary to select a suitable measure of similarity/dissimilarity within the data, and this choice has a significant impact on the clustering results. The selection should be based on the nature of the time series and on the application itself. When the goal is to group time series by their shape (shape-based clustering), the Dynamic Time Warping (DTW) distance is a desirable choice. By stretching or compressing segments of temporal data, DTW determines an optimal match between any two time series. In this way, time series exhibiting similar patterns that occur at different time periods are considered similar. Although DTW is well suited to comparing data with respect to shape, calculating the average of a collection of time series under this distance (which clustering methods require) is a challenging problem. As a result, combining the DTW distance with clustering techniques such as K-Means and Fuzzy C-Means, where the cluster centers (prototypes) are calculated by averaging the data, is difficult and may produce unsatisfactory results. In this study, three alternatives for fuzzy clustering of time series using the DTW distance are proposed. The first method applies a DTW-based averaging technique proposed in the literature to Fuzzy C-Means clustering. The second method considers Fuzzy C-Medoids clustering, while the third is a hybrid technique that exploits the advantages of both Fuzzy C-Means and Fuzzy C-Medoids when clustering time series. Experimental studies are reported over a set of time series from the UCR time series database. (C) 2015 Elsevier Ltd. All rights reserved.
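To make the two ingredients named in the abstract concrete, here is a minimal Python sketch of a DTW distance and of the standard Fuzzy C-Means membership update applied to precomputed distances. It is an illustration under stated assumptions, not the authors' implementation: the function names `dtw_distance` and `fuzzy_memberships`, the squared pointwise cost, the absence of a warping window, and the default fuzzifier `m = 2` are all choices made here for the example and are not taken from the paper.

```python
import numpy as np


def dtw_distance(a, b):
    """DTW distance between two 1-D series (assumed: squared local cost, no window)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # cost[i, j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i, j] = local + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(np.sqrt(cost[n, m]))


def fuzzy_memberships(dist, m=2.0, eps=1e-12):
    """Standard FCM membership update from a (clusters x series) distance matrix."""
    d = np.maximum(np.asarray(dist, dtype=float), eps)  # guard against zero distances
    inv = d ** (-2.0 / (m - 1.0))
    # Each column (one series) is normalized so its memberships sum to 1.
    return inv / inv.sum(axis=0, keepdims=True)


# Two series with the same shape, one delayed by a step, align at zero cost.
shifted = dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0])

# Memberships of two series with respect to two prototypes.
u = fuzzy_memberships([[1.0, 2.0], [1.0, 1.0]])
```

In a medoid-based variant, the rows of `dist` would be DTW distances from each series to the series currently chosen as prototypes, so no averaging under DTW is needed; how the paper selects and updates those prototypes is not specified in the abstract and is not reproduced here.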
Pages: 235-244
Page count: 10