Dynamic Time Warping Under Product Quantization, With Applications to Time-Series Data Similarity Search

被引:16
作者
Zhang, Haowen [1 ]
Dong, Yabo [1 ]
Li, Jing [1 ]
Xu, Duanqing [1 ]
机构
[1] Zhejiang Univ, Coll Comp Sci & Technol, Hangzhou 310012, Peoples R China
关键词
Time series analysis; Databases; Time measurement; Internet of Things; Search problems; Quantization (signal); Activity recognition; DTW barycenter averaging (DBA); dynamic time warping (DTW); filter-and-refine; product quantization (PQ); time series; AVERAGING METHOD; CLASSIFICATION; DISTANCE;
D O I
10.1109/JIOT.2021.3132017
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
The similarity search on sensor data generated by a myriad of sensing devices is a frequently encountered problem in the era of the Internet of Things (IoT). This sensor data generally appear in the form of time series, a temporally ordered sequence of real numbers obtained regularly in time. It has been widely accepted that the dynamic time warping (DTW) currently is the most prevalent similarity measure in the time-series mining community, mainly due to its flexibility and broad applicability. However, calculating DTW between two time series has quadratic time complexity, leading to unsatisfactory efficiency when performing the similarity search over the large time-series data set. The main contribution of this article is to propose a method called product quantization (PQ)-based DTW (PQDTW) for fast time-series approximate similarity search under DTW. The PQ, a well-known approximate nearest neighbor search approach, is used in PQDTW. Nevertheless, the conventional PQ is developed with the Euclidean distance and is not designed for DTW. To solve this problem, the DTW barycenter averaging (DBA) technique is utilized to adapt the PQ for DTW before using it. We employ PQDTW along with the filter-and-refine framework to efficiently and accurately perform the time-series similarity search. Our method can reasonably reduce many DTW computations in the filtering phase; thus, the query process is accelerated. We compare PQDTW with related popular algorithms using public time-series data sets. Experimental results verify that the proposal achieves the best tradeoff between query efficiency and retrieval accuracy compared to the competitors.
引用
收藏
页码:11814 / 11826
页数:13
相关论文
共 53 条
  • [1] The great time series classification bake off: a review and experimental evaluation of recent algorithmic advances
    Bagnall, Anthony
    Lines, Jason
    Bostrom, Aaron
    Large, James
    Keogh, Eamonn
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY, 2017, 31 (03) : 606 - 660
  • [2] Berndt DJ, 1994, P 3 INT C KNOWL DISC, P359
  • [3] Chen Lei, 2004, P 30 INT C VER LARG, P792, DOI DOI 10.1016/B978-012088469-8.50070-X
  • [4] ADF: An Anomaly Detection Framework for Large-Scale PM2.5 Sensing Systems
    Chen, Ling-Jyh
    Ho, Yao-Hua
    Hsieh, Hsin-Hung
    Huang, Shih-Ting
    Lee, Hu-Cheng
    Mahajan, Sachit
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2018, 5 (02): : 559 - 570
  • [5] Cormen T. H., 2009, INTRO ALGORITHMS
  • [6] Vehicle Trajectory Similarity: Models, Methods, and Applications
    De Sousa, Roniel S.
    Boukerche, Azzedine
    Loureiro, Antonio A. F.
    [J]. ACM COMPUTING SURVEYS, 2020, 53 (05)
  • [7] Demsar J, 2006, J MACH LEARN RES, V7, P1
  • [8] Optimized Product Quantization
    Ge, Tiezheng
    He, Kaiming
    Ke, Qifa
    Sun, Jian
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2014, 36 (04) : 744 - 755
  • [9] Quantization
    Gray, RM
    Neuhoff, DL
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 1998, 44 (06) : 2325 - 2383
  • [10] An Early Classification Approach for Multivariate Time Series of On-Vehicle Sensors in Transportation
    Gupta, Ashish
    Gupta, Hari Prabhat
    Biswas, Bhaskar
    Dutta, Tanima
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2020, 21 (12) : 5316 - 5327