DDR: an index method for large time-series datasets

Cited by: 14
Authors
An, JY
Chen, YPP
Chen, HX
Affiliations
[1] Deakin Univ, Sch Informat Technol, Fac Sci & Technol, Melbourne, Vic 3125, Australia
[2] Australia Res Council Ctr Bioinformat, Melbourne, Vic, Australia
[3] Univ Tsukuba, Inst Informat Sci & Elect, Tsukuba, Ibaraki 305, Japan
Funding
Australian Research Council;
Keywords
time series; indexing; dimensionality reduction;
DOI
10.1016/j.is.2004.05.001
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The tree index structure is a traditional method for searching similar data in large datasets. It is based on the presupposition that most sub-trees are pruned in the searching process. As a result, the number of page accesses is reduced. However, time-series datasets generally have a very high dimensionality. Because of the so-called dimensionality curse, the pruning effectiveness is reduced in high dimensionality. Consequently, the tree index structure is not a suitable method for time-series datasets. In this paper, we propose a two-phase (filtering and refinement) method for searching time-series datasets. In the filtering step, a quantized version of the time-series is used to construct a compact file, which is scanned to filter out irrelevant series. A small set of candidates is passed to the second step for refinement. In this step, we introduce an effective index compression method named grid-based datawise dimensionality reduction (DDR), which attempts to preserve the characteristics of the time-series. An experimental comparison with existing techniques demonstrates the utility of our approach. © 2004 Elsevier Ltd. All rights reserved.
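The two-phase idea in the abstract can be illustrated with a small, generic filter-and-refine range search. This is only a sketch under stated assumptions, not the paper's DDR method: it uses a plain uniform grid for the quantized "compact file", Euclidean distance, and hypothetical names (`quantize`, `lower_bound`, `range_search`, `cells`) chosen for illustration. The filter computes a lower bound on the true distance from the cell codes alone, so discarding a series in the filtering step can never lose a true answer.

```python
import math

def quantize(series, lo, hi, cells):
    """Map each value to the index of its uniform grid cell in [lo, hi]."""
    width = (hi - lo) / cells
    return [min(int((v - lo) / width), cells - 1) for v in series]

def lower_bound(query, code, lo, hi, cells):
    """Distance from the query to the box of cells holding the series.

    Each original value lies inside its cell, so this never exceeds
    the true Euclidean distance (assumes the data lies within [lo, hi]).
    """
    width = (hi - lo) / cells
    total = 0.0
    for q, c in zip(query, code):
        cell_lo = lo + c * width
        cell_hi = cell_lo + width
        gap = max(cell_lo - q, 0.0, q - cell_hi)
        total += gap * gap
    return math.sqrt(total)

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def range_search(query, data, radius, cells=8):
    """Return (candidates, hits) for series within `radius` of `query`."""
    lo = min(min(s) for s in data)
    hi = max(max(s) for s in data)
    if hi == lo:
        hi = lo + 1.0  # avoid a zero-width grid on constant data
    # Phase 1 (filtering): scan only the compact codes.
    codes = [quantize(s, lo, hi, cells) for s in data]
    candidates = [i for i, code in enumerate(codes)
                  if lower_bound(query, code, lo, hi, cells) <= radius]
    # Phase 2 (refinement): check the exact distance for the survivors.
    hits = [i for i in candidates if euclidean(query, data[i]) <= radius]
    return candidates, hits

data = [
    [0.0, 1.0, 2.0, 3.0],
    [0.1, 1.1, 2.1, 2.9],
    [8.0, 7.0, 6.0, 5.0],
    [4.0, 4.0, 4.0, 4.0],
    [0.9, 1.9, 2.9, 3.9],
]
query = [0.0, 1.0, 2.0, 3.0]
candidates, hits = range_search(query, data, radius=0.5)
print(candidates, hits)
```

In this toy run the last series survives the filter because it shares grid cells with the query, and is only rejected once the exact distance is computed in the refinement phase. A real system would store the codes on disk and, as the abstract suggests, apply a further reduction to the candidate set rather than reading full series.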
Pages: 333-348
Number of pages: 16