A new framework for mining weighted periodic patterns in time series databases

Cited by: 52
Authors
Chanda, Ashis Kumar [1]
Ahmed, Chowdhury Farhan [1, 2]
Samiullah, Md [1]
Leung, Carson K. [3]
Affiliations
[1] Univ Dhaka, Dept Comp Sci & Engn, Dhaka, Bangladesh
[2] Univ Strasbourg, ICube Lab, Strasbourg, France
[3] Univ Manitoba, Dept Comp Sci, Winnipeg, MB, Canada
Keywords
Data mining; Time series databases; Periodic pattern; Weighted pattern; Suffix tree; Flexible pattern; DISCOVERY;
DOI
10.1016/j.eswa.2017.02.028
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Mining periodic patterns in time series databases is a challenging research task that plays a significant role in decision making in real-life applications. Many algorithms exist for mining periodic patterns in time series, but they treat all patterns as equally important. However, in real-life applications such as market basket analysis, gene analysis, and network fault experiments, different types of items carry different levels of importance. Moreover, existing algorithms generate a huge number of periodic patterns in dense databases or at low minimum support thresholds, and most of these patterns are not important enough to inform decision making. Hence, a pruning mechanism is essential to reduce these unimportant patterns. To mine only the important patterns in minimal time, we propose a weight-based framework that assigns different weights to different items. We also develop a novel algorithm, WPPM (Weighted Periodic Pattern Mining Algorithm), for time series databases, built on a suffix trie structure. To the best of our knowledge, ours is the first proposal that can mine three types of weighted periodic patterns (single, partial, and full) in a single run. A pruning method that follows the downward closure property with respect to the maximum weight in a given database is introduced to discard unimportant patterns. The proposed algorithm offers flexibility to the user by allowing intermediate unimportant patterns to be skipped and different starting positions to be set in the time series sequence. The performance of the proposed algorithm is evaluated on real-life datasets by varying different parameters. In addition, a comparison between the proposed algorithm and an existing algorithm shows that the proposed approach outperforms the existing one in terms of running time and pattern generation. © 2017 Elsevier Ltd. All rights reserved.
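The maximum-weight pruning idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' WPPM algorithm and uses no suffix trie; it handles only single-symbol patterns. The function names, the definition of periodic support (fraction of the positions `start`, `start + period`, ... that hold the symbol), the definition of weighted support (item weight times periodic support), and the threshold `min_wsup` are all assumptions made for illustration, since the abstract does not define them.

```python
def periodic_support(series, symbol, period, start):
    """Assumed definition: fraction of positions start, start+period, ...
    in `series` that hold `symbol`."""
    positions = range(start, len(series), period)
    if len(positions) == 0:
        return 0.0
    hits = sum(1 for i in positions if series[i] == symbol)
    return hits / len(positions)


def mine_weighted_single(series, weights, period, min_wsup):
    """Return {(symbol, start): weighted_support} for single-symbol patterns.

    Hypothetical sketch: a candidate is pruned when even the maximum weight
    in the database cannot lift its support to the threshold. Because this
    bound depends only on support, a pruned candidate would not need to be
    extended into longer patterns either.
    """
    max_w = max(weights.values())
    result = {}
    for start in range(period):
        for symbol in sorted(set(series[start::period])):
            sup = periodic_support(series, symbol, period, start)
            if max_w * sup < min_wsup:
                continue  # upper bound fails: prune this candidate
            wsup = weights[symbol] * sup
            if wsup >= min_wsup:
                result[(symbol, start)] = wsup
    return result


# Illustrative data (made up): period 3, weights chosen as exact binary fractions.
series = "abcabdabcabc"
weights = {"a": 1.0, "b": 0.25, "c": 0.75, "d": 0.5}
print(mine_weighted_single(series, weights, period=3, min_wsup=0.5))
# prints {('a', 0): 1.0, ('c', 2): 0.5625}
```

In this toy run, `d` is pruned by the maximum-weight bound (0.25 support times the maximum weight 1.0 is below 0.5), while `b` passes the bound but is not reported because its own weight is too low. The distinction matters in a full miner: only candidates that fail the bound can be safely dropped before extension.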
Pages: 207-224
Number of pages: 18