Memory-Efficient Sequential Pattern Mining with Hybrid Tries

Cited by: 0
Authors
Hosseininasab, Amin [1]
van Hoeve, Willem-Jan [2]
Cire, Andre A. [3]
Affiliations
[1] Univ Florida, Warrington Coll Business, Gainesville, FL 32611 USA
[2] Carnegie Mellon Univ, Tepper Sch Business, Pittsburgh, PA USA
[3] Univ Toronto, Rotman Sch Management, Toronto, ON, Canada
Keywords
Sequential pattern mining; Memory efficiency; Large-scale pattern mining; Trie data set models; GENERATION;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper develops a memory-efficient approach for Sequential Pattern Mining (SPM), a fundamental topic in knowledge discovery that faces a well-known memory bottleneck for large data sets. Our methodology involves a novel hybrid trie data structure that exploits recurring patterns to compactly store the data set in memory, and a corresponding mining algorithm designed to effectively extract patterns from this compact representation. Numerical results on small to medium-sized real-life test instances show an average improvement of 85% in memory consumption and 49% in computation time compared to the state of the art. For large data sets, our algorithm stands out as the only capable SPM approach within 256 GB of system memory, potentially saving 1.7 TB in memory consumption.
Pages: 29