Scalable Classification of Repetitive Time Series Through Frequencies of Local Polynomials

Cited by: 13
Authors
Grabocka, Josif [1]
Wistuba, Martin [1]
Schmidt-Thieme, Lars [1]
Affiliation
[1] Univ Hildesheim, Informat Syst & Machine Learning Lab, D-31141 Hildesheim, Germany
Keywords
Time-series classification; symbolic polynomials; bag-of-words; SIMILARITY;
DOI
10.1109/TKDE.2014.2377746
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Time-series classification has attracted considerable research attention due to the various domains where time-series data are observed, ranging from medicine to econometrics. Traditionally, the focus of time-series classification has been on short time-series data composed of a few patterns exhibiting variabilities, while recently there have been attempts to focus on longer series composed of multiple local patterns repeating with an arbitrary irregularity. The primary contribution of this paper is a method which can detect local patterns in repetitive time-series via fitting local polynomial functions of a specified degree. We capture the repetitiveness degrees of time-series datasets via a new measure. Furthermore, our method approximates local polynomials in linear time and ensures an overall linear running time complexity. The coefficients of the polynomial functions are converted to symbolic words via equi-area discretizations of the coefficients' distributions. The symbolic polynomial words enable the detection of similar local patterns by assigning the same word to similar polynomials. Moreover, a histogram of the frequencies of the words is constructed from each time-series' bag of words. Each row of the histogram enables a new representation for the series and symbolizes the occurrence of local patterns and their frequencies. In an experimental comparison against state-of-the-art baselines on repetitive datasets, our method demonstrates significant improvements in terms of prediction accuracy.
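The pipeline outlined in the abstract (sliding-window polynomial fits, equi-area discretization of the coefficients, and a per-series word histogram) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `polynomial_word_histograms` and the parameters `window`, `degree`, `alphabet`, and `step` are hypothetical, the fit uses a precomputed pseudo-inverse of a Vandermonde matrix as one plausible way to keep the per-window cost constant, and every series is assumed to be at least `window` points long.

```python
import numpy as np

def polynomial_word_histograms(series_list, window=20, degree=2, alphabet=4, step=1):
    """Sketch: map each series to a histogram of symbolic polynomial words.

    All parameter names are illustrative assumptions, not taken from the paper.
    """
    # The design matrix is identical for every window, so its pseudo-inverse
    # is computed once; each least-squares fit is then a matrix product.
    t = np.linspace(0.0, 1.0, window)
    vander = np.vander(t, degree + 1)      # shape (window, degree + 1)
    pinv = np.linalg.pinv(vander)          # shape (degree + 1, window)

    # Fit a polynomial to every sliding window of every series.
    coeffs = []
    for s in series_list:
        s = np.asarray(s, dtype=float)
        starts = range(0, len(s) - window + 1, step)
        segments = np.stack([s[i:i + window] for i in starts])
        coeffs.append(segments @ pinv.T)   # shape (n_windows, degree + 1)

    # Equi-area discretization: thresholds are quantiles of each coefficient's
    # empirical distribution, so each symbol covers roughly equal mass.
    pooled = np.vstack(coeffs)
    levels = np.linspace(0.0, 1.0, alphabet + 1)[1:-1]
    thresholds = np.quantile(pooled, levels, axis=0)  # (alphabet - 1, degree + 1)

    # Encode each tuple of symbols as a single base-`alphabet` integer (a "word").
    weights = alphabet ** np.arange(degree + 1)
    n_words = alphabet ** (degree + 1)

    def to_words(c):
        symbols = np.stack(
            [np.searchsorted(thresholds[:, j], c[:, j]) for j in range(c.shape[1])],
            axis=1,
        )
        return symbols @ weights

    # One histogram row per series: counts of each word in its bag of words.
    return np.stack([np.bincount(to_words(c), minlength=n_words) for c in coeffs])
```

Each row of the returned array could then be passed to any standard classifier as the feature vector of the corresponding series. With a fixed `window` and `degree`, the work per window is constant, so the cost of this sketch grows linearly with series length.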
Pages: 1683-1695
Page count: 13