An Efficient Method for Mining Top-K Closed Sequential Patterns

Cited by: 10
Authors
Pham, Thi-Thiet [1]
Do, Tung [2]
Nguyen, Anh [3]
Vo, Bay [4]
Hong, Tzung-Pei [5, 6]
Affiliations
[1] Ind Univ Ho Chi Minh City, Fac Informat Technol, Ho Chi Minh City 700000, Vietnam
[2] Van Lang Univ, Fac Basic Sci, Ho Chi Minh City 700000, Vietnam
[3] Duy Tan Univ, Inst Res & Dev, Da Nang 550000, Vietnam
[4] Ho Chi Minh City Univ Technol HUTECH, Fac Informat Technol, Ho Chi Minh City 700000, Vietnam
[5] Natl Univ Kaohsiung, Dept Comp Sci & Informat Engn, Kaohsiung 811, Taiwan
[6] Natl Sun Yat Sen Univ, Dept Comp Sci & Engn, Kaohsiung 804, Taiwan
Keywords
Closed sequential pattern; data mining; sequential pattern; top-k sequential patterns; FREQUENT PATTERNS; ALGORITHMS;
DOI
10.1109/ACCESS.2020.3004528
CLC number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
Mining closed sequential patterns (CSPs) is an essential data mining task with many applications. It is used to cope with huge databases or low minimum support (minsup) thresholds when mining sequential patterns. However, tuning the minsup value so that the number of CSPs generated matches what users want is difficult and time-consuming. To address this issue, the TSP algorithm for mining top-k CSPs was previously proposed, where k is a given parameter. The algorithm returns the k CSPs with the highest support values in a database. However, its execution time and memory usage were high. In this paper, an algorithm named TKCS (Top-K Closed Sequences) is proposed to mine the top-k CSPs efficiently. To reduce execution time and memory usage, it represents the data as a vertical bitmap database. In addition, it adopts useful strategies while mining the top-k CSPs, such as always choosing the sequential patterns with the greatest support values for generating candidate patterns, and storing the top-k CSPs in ascending order of support so that the minsup value can be raised more quickly. The empirical results show that TKCS outperforms TSP in both runtime and memory usage when discovering the top-k CSPs.
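The two ideas named in the abstract, a vertical bitmap representation and expanding the highest-support pattern first, can be illustrated with a small sketch. This is not the authors' TKCS implementation: it assumes sequences of single items (no itemsets), uses a brute-force closedness check, and omits the ascending-order storage of results. All function names (`build_vertical`, `s_step`, `bits_of`, `is_closed`, `top_k_closed`) are hypothetical and chosen only for illustration.

```python
import heapq
from collections import defaultdict

def build_vertical(db):
    """Map item -> {sequence id -> bitmask of positions where the item occurs}."""
    vert = defaultdict(dict)
    for sid, seq in enumerate(db):
        for pos, item in enumerate(seq):
            vert[item][sid] = vert[item].get(sid, 0) | (1 << pos)
    return vert

def s_step(pattern_bits, item_bits):
    """Append one item: keep its positions strictly after the pattern's earliest end."""
    out = {}
    for sid, bits in pattern_bits.items():
        ib = item_bits.get(sid, 0)
        first = bits & -bits             # lowest set bit = earliest end position
        hit = ib & ~((first << 1) - 1)   # item positions strictly after that bit
        if hit:
            out[sid] = hit
    return out

def bits_of(pattern, vert):
    """Bitmaps of a whole pattern, built by repeated s-steps."""
    bits = vert.get(pattern[0], {})
    for item in pattern[1:]:
        bits = s_step(bits, vert.get(item, {}))
    return bits

def is_closed(pattern, sup, vert):
    """Brute force: closed iff no one-item insertion keeps the same support."""
    for pos in range(len(pattern) + 1):
        for item in vert:
            cand = pattern[:pos] + (item,) + pattern[pos:]
            if len(bits_of(cand, vert)) == sup:
                return False
    return True

def top_k_closed(db, k):
    """Return up to k closed patterns as (support, pattern), highest support first."""
    vert = build_vertical(db)
    # Max-heap on support (negated), so the best candidate is expanded first.
    cands = [(-len(b), (item,)) for item, b in vert.items()]
    heapq.heapify(cands)
    top = []
    while cands and len(top) < k:
        neg_sup, pat = heapq.heappop(cands)
        sup = -neg_sup
        if is_closed(pat, sup, vert):
            top.append((sup, pat))
        bits = bits_of(pat, vert)
        for item, ib in vert.items():
            ext = s_step(bits, ib)
            if ext:  # support is the number of sequences with a surviving bitmap
                heapq.heappush(cands, (-len(ext), pat + (item,)))
    return top

db = [["a", "b", "c"], ["a", "b", "c"], ["a", "b"], ["b"]]
print(top_k_closed(db, 2))
```

Because support never increases when a pattern is extended, popping candidates in non-increasing support order means the search in this sketch can stop as soon as k closed patterns have been collected; ties at the k-th support are broken arbitrarily by pattern order.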
Pages: 118156 - 118163
Page count: 8
Related papers
50 in total
  • [1] Mining Top-K Sequential Patterns in the Data Stream Environment
    Dai, Bi-Ru
    Jiang, Hung-Lin
    Chung, Chih-Heng
    INTERNATIONAL CONFERENCE ON TECHNOLOGIES AND APPLICATIONS OF ARTIFICIAL INTELLIGENCE (TAAI 2010), 2010, : 142 - 149
  • [2] Mining top-k approximate closed patterns in an imprecise database
    Yu, Xiaomei
    Wang, Hong
    Zheng, Xiangwei
    INTERNATIONAL JOURNAL OF GRID AND UTILITY COMPUTING, 2018, 9 (02) : 97 - 107
  • [3] Skopus: Mining top-k sequential patterns under leverage
    Petitjean, Francois
    Li, Tao
    Tatti, Nikolaj
    Webb, Geoffrey I.
    DATA MINING AND KNOWLEDGE DISCOVERY, 2016, 30 (05) : 1086 - 1111
  • [4] Skopus: Mining top-k sequential patterns under leverage
    François Petitjean
    Tao Li
    Nikolaj Tatti
    Geoffrey I. Webb
    Data Mining and Knowledge Discovery, 2016, 30 : 1086 - 1111
  • [5] Mining Top-k distinguishing sequential patterns with gap constraint
    School of Computer Science, Sichuan University, Chengdu 610065, China
    Unknown, 210003, China
    Unknown, 610041, China
    Unknown, 210003, China
    Ruan Jian Xue Bao, 11 : 2994 - 3009
  • [6] Efficient algorithms of mining top-k frequent closed itemsets
    Lan Yongjie
    Qiu Yong
    ICEMI 2007: PROCEEDINGS OF 2007 8TH INTERNATIONAL CONFERENCE ON ELECTRONIC MEASUREMENT & INSTRUMENTS, VOL II, 2007, : 551 - 554
  • [7] TFP: An efficient algorithm for mining top-K frequent closed itemsets
    Wang, JY
    Han, JW
    Lu, Y
    Tzvetkov, P
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (05) : 652 - 664
  • [8] Mining Top-k Distinguishing Temporal Sequential Patterns from Event Sequences
    Duan, Lei
    Yan, Li
    Dong, Guozhu
    Nummenmaa, Jyrki
    Yang, Hao
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2017), PT II, 2017, 10178 : 235 - 250
  • [9] DEVELOPMENT OF AN EFFICIENT TECHNIQUE FOR MINING TOP-K CLOSED HIGH UTILITY ITEMSETS
    Velayudhan, Baby
    Sakthivel
    Subasree
    IIOAB JOURNAL, 2016, 7 (09) : 150 - 155
  • [10] Top-k closed co-occurrence patterns mining with differential privacy over multiple streams
    Wang, Jinyan
    Fang, Shijian
    Liu, Chen
    Qin, Jiawen
    Li, Xianxian
    Shi, Zhenkui
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2020, 111 : 339 - 351