DDR: an index method for large time-series datasets

Cited by: 14
Authors
An, JY
Chen, YPP
Chen, HX
Affiliations
[1] Deakin Univ, Sch Informat Technol, Fac Sci & Technol, Melbourne, Vic 3125, Australia
[2] Australia Res Council Ctr Bioinformat, Melbourne, Vic, Australia
[3] Univ Tsukuba, Inst Informat Sci & Elect, Tsukuba, Ibaraki 305, Japan
Funding
Australian Research Council;
Keywords
time series; indexing; dimensionality reduction;
DOI
10.1016/j.is.2004.05.001
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
The tree index structure is a traditional method for searching similar data in large datasets. It is based on the presupposition that most sub-trees are pruned in the searching process. As a result, the number of page accesses is reduced. However, time-series datasets generally have a very high dimensionality. Because of the so-called dimensionality curse, the pruning effectiveness is reduced in high dimensionality. Consequently, the tree index structure is not a suitable method for time-series datasets. In this paper, we propose a two-phase (filtering and refinement) method for searching time-series datasets. In the filtering step, quantized time series are used to construct a compact file, which is scanned to filter out irrelevant series. A small set of candidates is passed to the second step for refinement. In this step, we introduce an effective index compression method named grid-based datawise dimensionality reduction (DDR), which attempts to preserve the characteristics of the time series. An experimental comparison with existing techniques demonstrates the utility of our approach. (c) 2004 Elsevier Ltd. All rights reserved.
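The two-phase idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the quantizer or the DDR compression, so the sketch assumes a uniform grid over a known value range [lo, hi], Euclidean distance, and a range query. All function names (`quantize`, `lower_bound`, `range_search`) are hypothetical. The filtering phase scans only the compact codes and uses a lower bound on the true distance, so no true match is discarded; the refinement phase computes exact distances for the surviving candidates.

```python
import math
import random


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length series."""
    total = 0.0
    for x, y in zip(a, b):
        total += (x - y) ** 2
    return math.sqrt(total)


def quantize(series, lo, hi, bits=4):
    """Map each value to a cell index on a uniform grid over [lo, hi].

    Assumption: every value lies inside [lo, hi].
    """
    cells = 1 << bits
    width = (hi - lo) / cells
    return [min(cells - 1, max(0, int((v - lo) / width))) for v in series]


def lower_bound(query, codes, lo, hi, bits=4):
    """Distance from the query to the nearest point of each grid cell.

    Because the original value lies inside its cell, this never exceeds
    the true Euclidean distance.
    """
    cells = 1 << bits
    width = (hi - lo) / cells
    total = 0.0
    for q, c in zip(query, codes):
        cell_lo = lo + c * width
        cell_hi = cell_lo + width
        if q < cell_lo:
            total += (cell_lo - q) ** 2
        elif q > cell_hi:
            total += (q - cell_hi) ** 2
    return math.sqrt(total)


def range_search(query, dataset, codes_file, radius, lo, hi, bits=4):
    """Return (ids within radius, number of candidates that were refined)."""
    # Phase 1 (filtering): scan the compact file of quantized codes.
    candidates = [
        i for i, codes in enumerate(codes_file)
        if lower_bound(query, codes, lo, hi, bits) <= radius
    ]
    # Phase 2 (refinement): exact distances on the candidates only.
    hits = [i for i in candidates if euclidean(query, dataset[i]) <= radius]
    return hits, len(candidates)


if __name__ == "__main__":
    random.seed(0)
    data = [[random.random() for _ in range(16)] for _ in range(200)]
    compact = [quantize(s, 0.0, 1.0) for s in data]
    q = [random.random() for _ in range(16)]
    hits, refined = range_search(q, data, compact, 1.2, 0.0, 1.0)
    print(len(hits), "hits after refining", refined, "of", len(data), "series")
```

With 4 bits per value, each code fits in half a byte, so the compact file is far smaller than the raw data; fewer bits shrink the file further but loosen the lower bound and leave more candidates for refinement.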
Pages: 333-348
Page count: 16