Exact indexing for massive time series databases under time warping distance

被引:18
作者
Niennattrakul, Vit [1 ]
Ruengronghirunya, Pongsakorn [1 ]
Ratanamahatana, Chotirat Ann [1 ]
机构
[1] Chulalongkorn Univ, Dept Comp Engn, Bangkok, Thailand
关键词
Time series; Indexing; Dynamic time warping; REPRESENTATION; ALGORITHM;
D O I
10.1007/s10618-010-0165-y
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Among many existing distance measures for time series data, Dynamic Time Warping (DTW) distance has been recognized as one of the most accurate and suitable distance measures due to its flexibility in sequence alignment. However, DTW distance calculation is computationally intensive. Especially in very large time series databases, sequential scan through the entire database is definitely impractical, even with random access that exploits some index structures since high dimensionality of time series data incurs extremely high I/O cost. More specifically, a sequential structure consumes high CPU but low I/O costs, while an index structure requires low CPU but high I/O costs. In this work, we therefore propose a novel indexed sequential structure called TWIST (Time Warping in Indexed Sequential sTructure) which benefits from both sequential access and index structure. When a query sequence is issued, TWIST calculates lower bounding distances between a group of candidate sequences and the query sequence, and then identifies the data access order in advance, hence reducing a great number of both sequential and random accesses. Impressively, our indexed sequential structure achieves significant speedup in a querying process. In addition, our method shows superiority over existing rival methods in terms of query processing time, number of page accesses, and storage requirement with no false dismissal guaranteed.
引用
收藏
页码:509 / 541
页数:33
相关论文
共 31 条
[11]  
Faloutsos Christos., 1994, ACM SIGMOD Record, V23, P419
[12]  
Guttman A., 1984, SIGMOD Record, V14, P47, DOI 10.1145/971697.602266
[13]   MINIMUM PREDICTION RESIDUAL PRINCIPLE APPLIED TO SPEECH RECOGNITION [J].
ITAKURA, F .
IEEE TRANSACTIONS ON ACOUSTICS SPEECH AND SIGNAL PROCESSING, 1975, AS23 (01) :67-72
[14]   Exact indexing of dynamic time warping [J].
Keogh, E ;
Ratanamahatana, CA .
KNOWLEDGE AND INFORMATION SYSTEMS, 2005, 7 (03) :358-386
[15]   On the need for time series data mining benchmarks: A survey and empirical demonstration [J].
Keogh, E ;
Kasetty, S .
DATA MINING AND KNOWLEDGE DISCOVERY, 2003, 7 (04) :349-371
[16]   Dimensionality Reduction for Fast Similarity Search in Large Time Series Databases [J].
Eamonn Keogh ;
Kaushik Chakrabarti ;
Michael Pazzani ;
Sharad Mehrotra .
Knowledge and Information Systems, 2001, 3 (3) :263-286
[17]  
Keogh E. J., 2000, Proceedings. KDD-2000. Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, P285, DOI 10.1145/347090.347153
[18]   An index-based approach for similarity search supporting time warping in large sequence databases [J].
Kim, SW ;
Park, S ;
Chu, WW .
17TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2001, :607-614
[19]   Experiencing SAX: a novel symbolic representation of time series [J].
Lin, Jessica ;
Keogh, Eamonn ;
Wei, Li ;
Lonardi, Stefano .
DATA MINING AND KNOWLEDGE DISCOVERY, 2007, 15 (02) :107-144
[20]   A subsequence matching algorithm that supports normalization transform in time-series databases [J].
Loh, WK ;
Kim, SW ;
Whang, KY .
DATA MINING AND KNOWLEDGE DISCOVERY, 2004, 9 (01) :5-28