An Efficient Method for Mining Top-K Closed Sequential Patterns

Cited by: 10
Authors
Pham, Thi-Thiet [1]
Do, Tung [2]
Nguyen, Anh [3]
Vo, Bay [4]
Hong, Tzung-Pei [5, 6]
Affiliations
[1] Ind Univ Ho Chi Minh City, Fac Informat Technol, Ho Chi Minh City 700000, Vietnam
[2] Van Lang Univ, Fac Basic Sci, Ho Chi Minh City 700000, Vietnam
[3] Duy Tan Univ, Inst Res & Dev, Da Nang 550000, Vietnam
[4] Ho Chi Minh City Univ Technol HUTECH, Fac Informat Technol, Ho Chi Minh City 700000, Vietnam
[5] Natl Univ Kaohsiung, Dept Comp Sci & Informat Engn, Kaohsiung 811, Taiwan
[6] Natl Sun Yat Sen Univ, Dept Comp Sci & Engn, Kaohsiung 804, Taiwan
Keywords
Closed sequential pattern; data mining; sequential pattern; top-k sequential patterns; FREQUENT PATTERNS; ALGORITHMS
DOI
10.1109/ACCESS.2020.3004528
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Mining Closed Sequential Patterns (CSPs) is an essential task in data mining with many applications. It addresses the difficulties that arise when mining sequential patterns from huge databases or with low minimum support (minsup) thresholds. However, tuning the minsup value so that the number of generated CSPs matches what users want is challenging and time-consuming. To address this issue, the TSP algorithm for mining top-k CSPs was previously proposed, where k is a user-given parameter. The algorithm returns the k CSPs with the highest support values in a database. However, its execution time and memory usage are high. In this paper, an algorithm named TKCS (Top-K Closed Sequences) is proposed to mine the top-k CSPs efficiently. To reduce execution time and memory usage, it represents the data as a vertical bitmap database. In addition, it adopts several strategies during mining: it always chooses the sequential patterns with the greatest support values for generating candidate patterns, and it stores the top-k CSPs in ascending order of support so that the minsup value can be raised more quickly. The empirical results show that TKCS outperforms TSP in both runtime and memory usage when discovering the top-k CSPs.
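The ideas named in the abstract can be illustrated with a small sketch. This is not the TKCS algorithm from the paper. It is a simplified illustration under several assumptions: every sequence is a list of single items (no itemsets), each item's occurrences are stored as one integer bitmap per sequence, candidates are expanded best-first by support, and closedness is checked by brute force among patterns of equal support. The function names (`build_vertical_bitmaps`, `s_extend`, `top_k_closed`) are hypothetical.

```python
import heapq
from collections import defaultdict


def build_vertical_bitmaps(db):
    """Map item -> {sequence id: int bitmap}, where bit p is set if the item is at position p."""
    bitmaps = defaultdict(dict)
    for sid, seq in enumerate(db):
        for pos, item in enumerate(seq):
            bitmaps[item][sid] = bitmaps[item].get(sid, 0) | (1 << pos)
    return bitmaps


def s_extend(pattern_bits, item_bits):
    """Append an item to a pattern: keep item positions after the pattern's earliest end."""
    out = {}
    for sid, bits in pattern_bits.items():
        low = bits & -bits              # lowest set bit = earliest end position
        after = ~((low << 1) - 1)       # mask of all positions strictly after it
        new = item_bits.get(sid, 0) & after
        if new:
            out[sid] = new
    return out


def is_subsequence(small, big):
    it = iter(big)
    return all(x in it for x in small)


def top_k_closed(db, k):
    """Return up to k (pattern, support) pairs with the highest support (simplified sketch)."""
    items = build_vertical_bitmaps(db)
    # Max-heap on support (negated); patterns are unique, so ties never compare the dicts.
    heap = [(-len(bits), (item,), bits) for item, bits in items.items()]
    heapq.heapify(heap)
    result = []
    while heap and len(result) < k:
        level = -heap[0][0]             # highest support still unexplored
        same = []
        # Expand every pattern at this support level; extensions never gain support.
        while heap and -heap[0][0] == level:
            _, pat, bits = heapq.heappop(heap)
            same.append(pat)
            for item, ibits in items.items():
                ext = s_extend(bits, ibits)
                if ext:
                    heapq.heappush(heap, (-len(ext), pat + (item,), ext))
        # A pattern is closed if no proper super-sequence has the same support.
        closed = [p for p in same
                  if not any(p != q and is_subsequence(p, q) for q in same)]
        result.extend((p, level) for p in sorted(closed))
    return result[:k]


db = [["a", "b", "c"], ["a", "c"], ["a", "b", "c"], ["b", "c"]]
top3 = top_k_closed(db, 3)
```

Because support can only shrink when a pattern is extended, popping candidates in decreasing order of support means the search can stop as soon as k closed patterns have been collected. That plays the role of a rising minsup threshold in this sketch; the paper's actual data structures and pruning strategies may differ.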
Pages: 118156-118163
Number of pages: 8