DDR: an index method for large time-series datasets

Cited by: 14

Authors:

An, JY

Chen, YPP

Chen, HX

Affiliations:

[1] Deakin Univ, Sch Informat Technol, Fac Sci & Technol, Melbourne, Vic 3125, Australia

[2] Australia Res Council Ctr Bioinformat, Melbourne, Vic, Australia

[3] Univ Tsukuba, Inst Informat Sci & Elect, Tsukuba, Ibaraki 305, Japan

Source:

INFORMATION SYSTEMS | 2005 / Vol. 30 / Issue 05

Funding:

Australian Research Council;

Keywords:

time series; indexing; dimensionality reduction;

DOI:

10.1016/j.is.2004.05.001

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

The tree index structure is a traditional method for searching for similar data in large datasets. It rests on the presupposition that most sub-trees are pruned during the search, which reduces the number of page accesses. However, time-series datasets generally have very high dimensionality, and because of the so-called curse of dimensionality, pruning becomes less effective as dimensionality grows. Consequently, the tree index structure is not a suitable method for time-series datasets. In this paper, we propose a two-phase (filtering and refinement) method for searching time-series datasets. In the filtering step, quantized time-series are used to construct a compact file, which is scanned to filter out irrelevant data. A small set of candidates is passed to the second step for refinement. In this step, we introduce an effective index compression method named grid-based datawise dimensionality reduction (DDR), which attempts to preserve the characteristics of the time-series. An experimental comparison with existing techniques demonstrates the utility of our approach. (c) 2004 Elsevier Ltd. All rights reserved.
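
The two-phase idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's DDR method, whose details the abstract does not give. It is a generic filter-and-refine range search that assumes a uniform grid quantizer and a cell-based lower bound on Euclidean distance. All function names, the `bits` parameter, and the value range `[lo, hi]` are hypothetical choices made for illustration, and every data value is assumed to lie inside `[lo, hi]`.

```python
import math


def quantize(series, lo, hi, bits=4):
    """Assumed quantizer: map each value to a cell of a uniform grid over [lo, hi]."""
    cells = 1 << bits
    width = (hi - lo) / cells
    return [min(cells - 1, max(0, int((v - lo) / width))) for v in series]


def lower_bound(query, code, lo, hi, bits=4):
    """Distance from the query to the nearest point of the grid cells in `code`.

    For data inside [lo, hi] this never exceeds the true Euclidean distance,
    so filtering on it cannot discard a true match.
    """
    cells = 1 << bits
    width = (hi - lo) / cells
    total = 0.0
    for q, c in zip(query, code):
        cell_lo = lo + c * width
        cell_hi = cell_lo + width
        if q < cell_lo:
            total += (cell_lo - q) ** 2
        elif q > cell_hi:
            total += (q - cell_hi) ** 2
    return math.sqrt(total)


def euclidean(a, b):
    """Exact Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_search(query, data, codes, eps, lo, hi, bits=4):
    """Return indices of series within distance eps of the query."""
    # Phase 1 (filtering): scan only the compact codes.
    candidates = [
        i for i, code in enumerate(codes)
        if lower_bound(query, code, lo, hi, bits) <= eps
    ]
    # Phase 2 (refinement): exact distance on the surviving candidates.
    return [i for i in candidates if euclidean(query, data[i]) <= eps]
```

In this sketch the list of codes plays the role of the compact file: it is scanned sequentially, and the full series are read only for candidates that survive the filter.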


Pages: 333-348

Page count: 16
