An Efficient Method for Mining Top-K Closed Sequential Patterns

Cited by: 10
Authors
Pham, Thi-Thiet [1]
Do, Tung [2]
Nguyen, Anh [3]
Vo, Bay [4]
Hong, Tzung-Pei [5,6]
Affiliations
[1] Ind Univ Ho Chi Minh City, Fac Informat Technol, Ho Chi Minh City 700000, Vietnam
[2] Van Lang Univ, Fac Basic Sci, Ho Chi Minh City 700000, Vietnam
[3] Duy Tan Univ, Inst Res & Dev, Da Nang 550000, Vietnam
[4] Ho Chi Minh City Univ Technol HUTECH, Fac Informat Technol, Ho Chi Minh City 700000, Vietnam
[5] Natl Univ Kaohsiung, Dept Comp Sci & Informat Engn, Kaohsiung 811, Taiwan
[6] Natl Sun Yat Sen Univ, Dept Comp Sci & Engn, Kaohsiung 804, Taiwan
Keywords
Closed sequential pattern; data mining; sequential pattern; top-k sequential patterns; FREQUENT PATTERNS; ALGORITHMS;
DOI
10.1109/ACCESS.2020.3004528
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Mining Closed Sequential Patterns (CSPs) is an essential task in data mining, with many different applications. It is used to handle huge databases or low minimum support (minsup) thresholds in sequential pattern mining. However, tuning the minsup value so that the number of CSPs generated matches what users want is difficult and time-consuming. To address this issue, the TSP algorithm for mining top-k CSPs was previously proposed, with k being a given parameter. The algorithm returns the k CSPs that have the highest support values in a database. However, its execution time and memory usage are high. In this paper, an algorithm named TKCS (Top-K Closed Sequences) is proposed to mine the top-k CSPs efficiently. To reduce execution time and memory usage, it uses a vertical bitmap database to represent the data. It also adopts two strategies while mining the top-k CSPs: always choosing the sequential patterns with the greatest support values for generating candidate patterns, and storing the top-k CSPs in ascending order of support so that the minsup value rises more quickly. The empirical results show that TKCS outperforms TSP in discovering the top-k CSPs in terms of both runtime and memory usage.
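The two strategies named in the abstract (expanding the highest-support pattern first, and keeping the current top-k ordered by support so the weakest entry sets a rising minsup) can be illustrated with a minimal Python sketch. This is not the TKCS algorithm: the closure check and the vertical bitmap representation are omitted, support is counted by a plain scan, sequences are assumed to be tuples of single items, and the names `top_k_sequences`, `support`, and `is_subsequence` are hypothetical.

```python
import heapq


def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of database sequences that contain pattern (plain scan)."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def top_k_sequences(database, k):
    """Simplified top-k sequential pattern search (no closure check).

    Assumption: each sequence is a tuple of single items, not itemsets.
    """
    items = sorted({item for seq in database for item in seq})
    minsup = 1        # rises once k patterns have been found
    top_k = []        # min-heap: the weakest kept pattern sits at top_k[0]
    candidates = []   # max-heap (negated support) for best-first expansion

    def consider(pattern):
        nonlocal minsup
        sup = support(pattern, database)
        if sup < minsup:
            return    # cannot enter the top-k, and neither can extensions
        if len(top_k) < k:
            heapq.heappush(top_k, (sup, pattern))
        elif sup > top_k[0][0]:
            heapq.heapreplace(top_k, (sup, pattern))
        if len(top_k) == k:
            minsup = top_k[0][0]   # the weakest kept support is the new bar
        heapq.heappush(candidates, (-sup, pattern))

    for item in items:
        consider((item,))

    while candidates:
        neg_sup, pattern = heapq.heappop(candidates)
        if -neg_sup < minsup:
            break     # every remaining candidate is at least as weak
        for item in items:
            consider(pattern + (item,))

    return {pattern: sup for sup, pattern in top_k}
```

For example, with the toy database `[("a", "b", "c"), ("a", "b"), ("a", "c"), ("b",)]` and `k = 2`, the threshold rises to 3 as soon as `("a",)` and `("b",)` are stored, so `("c",)` and every longer pattern are pruned without being expanded. Ties at the threshold are resolved arbitrarily in this sketch.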
Pages: 118156-118163
Number of pages: 8