Nearest Neighbor Subsequence Search in Time Series Data

Cited by: 0
Authors
Ahsan, Ramoza [1]
Bashir, Muzammil [1]
Neamtu, Rodica [1]
Rundensteiner, Elke A. [1]
Sarkozy, Gabor [1]
Affiliations
[1] Worcester Polytech Inst, Worcester, MA 01609 USA
Source
2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2019
Keywords
Time Series Data; Subsequence Mining; Nearest Neighbor Search;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Continuous growth in sensor data and other temporal sequence data necessitates efficient retrieval and similarity search support on these big time series datasets. However, finding exact similarity results, especially at the granularity of subsequences, is known to be prohibitively costly for large data sets. In this paper, we thus propose an efficient framework for solving this exact subsequence similarity match problem, called TINN (Time series Nearest Neighbor search). Exploiting the range interval diversity properties of time series datasets, TINN captures similarity at two levels of abstraction, namely, relationships among subsequences within each long time series and relationships across distinct time series in the data set. These relationships are compactly organized in an augmented relationship graph model, with the former relationships encoded in similarity vectors at TINN nodes and the latter captured by augmented edge types in the TINN Graph. The query processing strategy deploys novel pruning techniques on the TINN Graph, including node skipping, vertical pruning, and horizontal pruning, to significantly reduce the number of time series as well as subsequences to be explored. Comprehensive experiments on synthetic and real-world time series data demonstrate that our TINN model consistently outperforms state-of-the-art approaches while still guaranteeing to retrieve exact matches.
Pages: 2057-2066
Page count: 10