A Novel Parallel Scheme for Fast Similarity Search in Large Time Series

被引:9
作者
Yin Hong [1 ,3 ]
Yang Shuqiang [1 ]
Ma Shaodong [2 ]
Liu Fei [1 ]
Chen Zhikun [1 ]
机构
[1] Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China
[2] Univ Hull, Sch Engn, Kingston Upon Hull HU6 7RX, N Humberside, England
[3] Xiangyang Sch NCOs, Xiangyang 441118, Peoples R China
基金
中国国家自然科学基金;
关键词
similarity; DTW; warping path; time series; MapReduce; parallelization; cluster;
D O I
10.1109/CC.2015.7084408
中图分类号
TN [电子技术、通信技术];
学科分类号
0809 ;
摘要
The similarity search is one of the fundamental components in time series data mining, e.g. clustering, classification, association rules mining. Many methods have been proposed to measure the similarity between time series, including Euclidean distance, Manhattan distance, and dynamic time warping (DTW). In contrast, DTW has been suggested to allow more robust similarity measure and be able to find the optimal alignment in time series. However, due to its quadratic time and space complexity, DTW is not suitable for large time series datasets. Many improving algorithms have been proposed for DTW search in large databases, such as approximate search or exact indexed search. Unlike the previous modified algorithm, this paper presents a novel parallel scheme for fast similarity search based on DTW, which is called MRDTW (MapRedcue-based DTW). The experimental results show that our approach not only retained the original accuracy as DTW, but also greatly improved the efficiency of similarity measure in large time series.
引用
收藏
页码:129 / 140
页数:12
相关论文
共 25 条
  • [1] Adams N, 2005, ITERATIVE DEEPENING
  • [2] A Unified Framework for Gesture Recognition and Spatiotemporal Gesture Segmentation
    Alon, Jonathan
    Athitsos, Vassilis
    Yuan, Quan
    Sclaroff, Stan
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2009, 31 (09) : 1685 - 1699
  • [3] Assent I., 2008, EDBT, P252, DOI [10.1145/1353343.1353376, DOI 10.1145/1353343.1353376]
  • [4] Berndt Donald J, 1994, P 3 INT C KNOWLEDGE, V10, P359
  • [5] Catalano J, 2006, DISCOVERING PATTERNS
  • [6] Chadwick N. A., 2011, 2011 5th IEEE International Conference on Digital Ecosystems and Technologies (DEST 2011), P285, DOI 10.1109/DEST.2011.5936640
  • [7] Chodorow C, 2010, FREE OP SOURC SOFTW
  • [8] Chu S., 2002, SDM
  • [9] Das G., 1998, Proceedings Fourth International Conference on Knowledge Discovery and Data Mining, P16
  • [10] Dean J, 2004, USENIX ASSOCIATION PROCEEDINGS OF THE SIXTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDE '04), P137