Piecewise Trend Approximation: A Ratio-Based Time Series Representation

Cited by: 5
Authors
Dan, Jingpei [1]
Shi, Weiren [2]
Dong, Fangyan [3]
Hirota, Kaoru [3]
Affiliations
[1] Chongqing Univ, Coll Comp Sci, Chongqing 400044, Peoples R China
[2] Chongqing Univ, Sch Automat, Chongqing 400044, Peoples R China
[3] Tokyo Inst Technol, Dept Computat Intelligence & Syst Sci, Midori Ku, Yokohama, Kanagawa 2268502, Japan
DOI
10.1155/2013/603629
Chinese Library Classification
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
A time series representation, piecewise trend approximation (PTA), is proposed to improve the efficiency of time series data mining in large, high-dimensional databases. PTA represents a time series in concise form while retaining its main trends, so the dimensionality of the original data is reduced and the key features are preserved. Unlike representations based on the original data space, PTA transforms the original data space into a feature space of ratios between consecutive data points in the original time series; the sign and magnitude of each ratio indicate the direction and degree of the local trend, respectively. In this ratio-based feature space, segmentation is performed so that any two adjacent segments have different trends, and each segment is then approximated by the ratio between its first and last points. To validate PTA, it is compared with the classical time series representations PAA and APCA on two classical datasets using the commonly used K-NN classification algorithm. On the ControlChart dataset, PTA achieves classification accuracy 3.55% and 2.33% higher than PAA and APCA, respectively; on the Mixed-BagShapes dataset, it is 8.94% and 7.07% higher, respectively. These results indicate that PTA is effective for high-dimensional time series data mining.
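The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the exact ratio formula, so the sketch assumes the local ratio is the relative change `(x[i+1] - x[i]) / x[i]`, which gives the ratio a sign as the abstract requires. It also assumes that a segment boundary falls wherever the sign of the local ratio changes, and that all values are non-zero. The names `local_ratios`, `sign`, and `pta_sketch` are hypothetical.

```python
from typing import List, Tuple


def local_ratios(series: List[float]) -> List[float]:
    """Relative change between consecutive points (assumed ratio definition)."""
    # Assumption: every value is non-zero, otherwise the division fails.
    return [(b - a) / a for a, b in zip(series, series[1:])]


def sign(value: float) -> int:
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def pta_sketch(series: List[float]) -> List[Tuple[int, float]]:
    """Segment a series by trend direction.

    Returns one (end_index, segment_ratio) pair per segment, where
    segment_ratio is the relative change from the segment's first point
    to its last point. Adjacent segments share their boundary point.
    """
    if len(series) < 2:
        return []
    ratios = local_ratios(series)
    segments: List[Tuple[int, float]] = []
    start = 0  # index of the first point of the current segment
    for i in range(1, len(ratios)):
        # Assumed segmentation rule: a change in the sign of the local
        # ratio closes the current segment at point i.
        if sign(ratios[i]) != sign(ratios[i - 1]):
            segments.append((i, (series[i] - series[start]) / series[start]))
            start = i
    last = len(series) - 1
    segments.append((last, (series[last] - series[start]) / series[start]))
    return segments


if __name__ == "__main__":
    print(pta_sketch([1.0, 2.0, 4.0, 2.0, 1.0, 1.0, 3.0]))
```

In this sketch a seven-point series collapses to four (end index, ratio) pairs: a rise, a fall, a flat step, and a final rise. The published method may use a different ratio definition or merge segments with a threshold, so the full paper should be consulted for the exact rules.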
Pages: 7