An efficient similarity searching algorithm based on clustering for time series

Cited by: 0
Authors
Feng, Yucai [1]
Jiang, Tao [1]
Zhou, Yingbiao [1]
Li, Junkui [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Coll Comp Sci & Technol, Wuhan 430074, Peoples R China
Source
ADVANCES IN DATA MINING, PROCEEDINGS: MEDICAL APPLICATIONS, E-COMMERCE, MARKETING, AND THEORETICAL ASPECTS | 2008 / Vol. 5077
Keywords
time series; clustering; similarity search; indexing
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Indexing large time series databases is crucial for answering time series queries efficiently. In this paper, we propose a novel indexing scheme, RQI (Range Query based on Index), which combines three filtering methods: first-k filtering, index-based lower and upper bounding, and triangle inequality pruning. The basic idea is to compute the Haar wavelet transform of each time series and use its first k coefficients to form an MBR (minimal bounding rectangle), to which a point filtering method is then applied. In addition, the lower and upper bounding features of each time series are calculated in advance and stored in the index structure. Finally, triangle inequality pruning is applied using distances between time series that are calculated beforehand. We then introduce a novel lower bounding distance function, SLBS (Symmetrical Lower Bounding based on Segment), and a novel clustering algorithm, CSA (Clustering based on Segment Approximation), to further improve the search efficiency of the point filtering method by preserving good clustering in the index structure. Extensive experiments on both synthetic and real datasets show that our techniques provide excellent pruning power and can achieve an order-of-magnitude performance improvement for time series queries over traditional naive evaluation techniques.
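The abstract does not give the details of RQI, so the following is only a minimal sketch of two of the general ideas it names: first-k filtering on Haar wavelet coefficients and triangle inequality bounds. It assumes Euclidean distance, series whose length is a power of two, and an orthonormal Haar transform, under which the distance over the first k coefficients never exceeds the true distance. The names `haar_transform`, `triangle_bounds`, and `range_query` are hypothetical and are not taken from the paper; the MBR index, SLBS, and CSA are not reproduced here.

```python
import math


def haar_transform(series):
    """Orthonormal Haar transform; the length must be a power of two."""
    n = len(series)
    if n == 0 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    approx = [float(x) for x in series]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Coarser detail coefficients are placed in front of finer ones.
        details = [(a - b) / math.sqrt(2) for a, b in pairs] + details
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx + details


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triangle_bounds(d_query_ref, d_ref_cand):
    """Bounds on d(query, cand) from two distances to a shared reference."""
    return abs(d_query_ref - d_ref_cand), d_query_ref + d_ref_cand


def range_query(query, database, eps, k=2):
    """Return (indices within eps of query, number of full distance computations)."""
    q_head = haar_transform(query)[:k]
    # In a real index these prefixes would be computed once, offline.
    heads = [haar_transform(s)[:k] for s in database]
    hits, full = [], 0
    for i, (series, head) in enumerate(zip(database, heads)):
        if euclidean(q_head, head) > eps:
            continue  # lower bound already exceeds eps: safe to prune
        full += 1
        if euclidean(query, series) <= eps:
            hits.append(i)
    return hits, full


# Toy data (made up for illustration, not from the paper's experiments).
database = [[1, 2, 3, 4], [1, 2, 3, 5], [10, 10, 10, 10], [0, 0, 0, 0]]
hits, full = range_query([1, 2, 3, 4], database, eps=1.5, k=1)
print(hits, full)
```

In this sketch, a candidate is discarded as soon as its first-k distance exceeds the query radius, so the full distance is computed only for the survivors. `triangle_bounds` shows how a precomputed distance to a shared reference series could give both a lower bound (for pruning) and an upper bound (for accepting a candidate without a full computation).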
Pages: 360-373
Page count: 14
Related papers
50 in total
  • [41] Trend and Value based Time Series Representation for Similarity Search
    Kane, Aminata
    2017 IEEE THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM 2017), 2017, : 252 - 259
  • [42] Time Series Similarity Search based on Middle Points and Clipping
    Nguyen Thanh Son
    Duong Tuan Anh
    2011 3RD CONFERENCE ON DATA MINING AND OPTIMIZATION (DMO), 2011, : 13 - 19
  • [43] Cloud Model-Based Adaptive Time-Series Information Granulation Algorithm and Its Similarity Measurement
    Chen, Hailan
    Gao, Xuedong
    Wu, Qi
    Huang, Ruojin
    ENTROPY, 2025, 27 (02)
  • [44] Time-series clustering - A decade review
    Aghabozorgi, Saeed
    Shirkhorshidi, Ali Seyed
    Teh Ying Wah
    INFORMATION SYSTEMS, 2015, 53 : 16 - 38
  • [45] Feature-Based Online Representation Algorithm for Streaming Time Series Similarity Search
    Zhan, Peng
    Sun, Changchang
    Hu, Yupeng
    Luo, Wei
    Zheng, Jiecai
    Li, Xueqing
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2020, 34 (05)
  • [46] Efficient k-NN Searching over Large Uncertain Time Series Database
    Qian, Ailing
    Ding, Xiaofeng
    Lu, Yansheng
    JOURNAL OF RESEARCH AND PRACTICE IN INFORMATION TECHNOLOGY, 2014, 46 (01): 47 - 60
  • [47] Financial Time Series Clustering
    Gupta, Kartikay
    Chatterjee, Niladri
    INFORMATION AND COMMUNICATION TECHNOLOGY FOR INTELLIGENT SYSTEMS (ICTIS 2017) - VOL 2, 2018, 84 : 146 - 156
  • [48] Efficient Time Series Clustering by Minimizing Dynamic Time Warping Utilization
    Cai, Borui
    Huang, Guangyan
    Samadiani, Najmeh
    Li, Guanghui
    Chi, Chi-Hung
    IEEE ACCESS, 2021, 9 : 46589 - 46599
  • [49] A Similarity Model Based On Trend For Time Series
    Chen, ShuaiFei
    Lv, Xin
    Yu, Lin
    Mao, YingChi
    Wang, LongBao
    Ma, HongXu
    14TH INTERNATIONAL SYMPOSIUM ON DISTRIBUTED COMPUTING AND APPLICATIONS FOR BUSINESS, ENGINEERING AND SCIENCE (DCABES 2015), 2015, : 435 - 438
  • [50] Clustering of Time Series Based on Forecasting Performance of Global Models
    Lopez-Oriona, Angel
    Montero-Manso, Pablo
    Vilar, Jose A.
    ADVANCED ANALYTICS AND LEARNING ON TEMPORAL DATA, AALTD 2022, 2023, 13812 : 18 - 33