Memory-Efficient Sequential Pattern Mining with Hybrid Tries

Authors
Hosseininasab, Amin [1]
van Hoeve, Willem-Jan [2]
Cire, Andre A. [3]
Affiliations
[1] Univ Florida, Warrington Coll Business, Gainesville, FL 32611 USA
[2] Carnegie Mellon Univ, Tepper Sch Business, Pittsburgh, PA USA
[3] Univ Toronto, Rotman Sch Management, Toronto, ON, Canada
Keywords
Sequential pattern mining; Memory efficiency; Large-scale pattern mining; Trie data set models; GENERATION
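DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract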
This paper develops a memory-efficient approach for Sequential Pattern Mining (SPM), a fundamental topic in knowledge discovery that faces a well-known memory bottleneck for large data sets. Our methodology involves a novel hybrid trie data structure that exploits recurring patterns to compactly store the data set in memory, and a corresponding mining algorithm designed to effectively extract patterns from this compact representation. Numerical results on small to medium-sized real-life test instances show an average improvement of 85% in memory consumption and 49% in computation time compared to the state of the art. For large data sets, our algorithm stands out as the only capable SPM approach within 256 GB of system memory, potentially saving 1.7 TB in memory consumption.
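As a rough illustration of why a trie can store a sequence data set compactly, the sketch below builds a plain prefix trie in which sequences sharing a prefix reuse the same nodes. It is not the hybrid trie or the mining algorithm from the paper, whose details are not given in this record; the names TrieNode, build_trie, count_nodes and prefix_support, and the toy database, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    """One node of a plain prefix trie over item sequences (illustrative only)."""
    children: dict = field(default_factory=dict)  # item -> TrieNode
    count: int = 0  # number of sequences whose prefix passes through this node


def build_trie(sequences):
    """Insert every sequence; shared prefixes reuse the same nodes."""
    root = TrieNode()
    for seq in sequences:
        node = root
        for item in seq:
            node = node.children.setdefault(item, TrieNode())
            node.count += 1
    return root


def count_nodes(node):
    """Number of non-root nodes, i.e. items actually stored after prefix sharing."""
    return sum(1 + count_nodes(child) for child in node.children.values())


def prefix_support(root, prefix):
    """Number of sequences that start with `prefix` (0 if the path is absent)."""
    node = root
    for item in prefix:
        node = node.children.get(item)
        if node is None:
            return 0
    return node.count


# Toy database (an assumption for illustration, not data from the paper).
database = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "b", "c", "e"],
    ["b", "c"],
]
trie = build_trie(database)
raw_items = sum(len(seq) for seq in database)  # items stored without sharing
stored_items = count_nodes(trie)               # items stored with prefix sharing
```

Note that prefix_support only counts sequences beginning with a given prefix. Support in SPM is defined over subsequences, so a real miner needs more machinery than this sketch provides.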
Pages: 29
Related papers
50 in total
  • [1] An Efficient Approach for Mining Sequential Pattern
    Pant, Nidhi
    Kant, Surya
    Pant, Bhaskar
    Sharma, Shashi Kumar
    PROCEEDINGS OF FIFTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING FOR PROBLEM SOLVING (SOCPROS 2015), VOL 2, 2016, 437 : 587 - 596
  • [2] Efficient weighted sequential pattern mining
    Chen, Shaotao
    Chen, Jiahui
    Wan, Shicheng
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 243
  • [3] Efficient sequential pattern mining with wildcards for keyphrase extraction
    Xie, Fei
    Wu, Xindong
    Zhu, Xingquan
    KNOWLEDGE-BASED SYSTEMS, 2017, 115 : 27 - 39
  • [4] A Memory-Efficient Hybrid Parallel Framework for Deep Neural Network Training
    Li, Dongsheng
    Li, Shengwei
    Lai, Zhiquan
    Fu, Yongquan
    Ye, Xiangyu
    Cai, Lei
    Qiao, Linbo
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, 35 (04) : 577 - 591
  • [5] Towards Efficient Sequential Pattern Mining in Temporal Uncertain Databases
    Ge, Jiaqi
    Xia, Yuni
    Wang, Jian
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PART II, 2015, 9078 : 268 - 279
  • [6] Memory-Efficient Assembly Using Flye
    Freire, Borja
    Ladra, Susana
    Parama, Jose R.
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2022, 19 (06) : 3564 - 3577
  • [7] Differentiable Slimming for Memory-Efficient Transformers
    Penkov, Nikolay
    Balaskas, Konstantinos
    Rapp, Martin
    Henkel, Joerg
    IEEE EMBEDDED SYSTEMS LETTERS, 2023, 15 (04) : 186 - 189
  • [8] Memory-Efficient Minimax Distance Measures
    Hoseini, Fazeleh
    Chehreghani, Morteza Haghir
    ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2022, PT I, 2022, 13280 : 419 - 431
  • [9] CCSMP: an efficient closed contiguous sequential pattern mining algorithm with a pattern relation graph
    Hu, Haichuan
    Zhang, Jingwei
    Xia, Ruiqing
    Liu, Shichao
    APPLIED INTELLIGENCE, 2023, 53 (24) : 29723 - 29740