LASH: Large-Scale Sequence Mining with Hierarchies

Cited by: 10
Authors
Beedkar, Kaustubh [1 ]
Gemulla, Rainer [1 ]
Affiliations
[1] Univ Mannheim, Data & Web Sci Grp, Mannheim, Germany
Keywords
DOI
10.1145/2723372.2723724
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
We propose LASH, a scalable, distributed algorithm for mining sequential patterns in the presence of hierarchies. LASH takes as input a collection of sequences, each composed of items from some application-specific vocabulary. In contrast to traditional approaches to sequence mining, the items in the vocabulary are arranged in a hierarchy: both input sequences and sequential patterns may consist of items from different levels of the hierarchy. Such hierarchies naturally occur in a number of applications, including mining natural-language text, customer transactions, error logs, or event sequences. LASH is the first parallel algorithm for mining frequent sequences with hierarchies; it is designed to scale to very large datasets. At its heart, LASH partitions the data using a novel, hierarchy-aware variant of item-based partitioning and subsequently mines each partition independently and in parallel using a customized mining algorithm called pivot sequence miner. LASH is amenable to a MapReduce implementation; we propose effective and efficient algorithms for both the construction and the actual mining of partitions. Our experimental study on large real-world datasets suggests good scalability and run-time efficiency.
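To make the problem setting in the abstract concrete, the following is a minimal single-machine sketch of what it can mean for a pattern to occur in a sequence when items are arranged in a hierarchy. It is not the LASH algorithm, its partitioning scheme, or the pivot sequence miner. It assumes the hierarchy is a forest given as a child-to-parent map, that a pattern item matches an input item if it is that item or one of its ancestors, and that arbitrary gaps between matched items are allowed. The names `PARENT`, `DATABASE`, `ancestors_or_self`, `matches`, and `support` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Set


def ancestors_or_self(item: str, parent: Dict[str, str]) -> Set[str]:
    """Return the item together with all of its ancestors.

    Assumes `parent` describes a forest (no cycles).
    """
    result = {item}
    while item in parent:
        item = parent[item]
        result.add(item)
    return result


def matches(pattern: Sequence[str], sequence: Sequence[str],
            parent: Dict[str, str]) -> bool:
    """Check whether `pattern` is a generalized subsequence of `sequence`.

    A pattern item matches an input item if it equals that item or is one
    of its ancestors. Gaps are unrestricted (an assumption of this sketch),
    so a greedy left-to-right scan is sufficient.
    """
    pos = 0
    for item in sequence:
        if pos < len(pattern) and pattern[pos] in ancestors_or_self(item, parent):
            pos += 1
    return pos == len(pattern)


def support(pattern: Sequence[str], database: List[Sequence[str]],
            parent: Dict[str, str]) -> int:
    """Count the input sequences that contain `pattern`."""
    return sum(matches(pattern, seq, parent) for seq in database)


# Hypothetical toy hierarchy (child -> parent) and input sequences.
PARENT = {
    "laptop": "computer",
    "desktop": "computer",
    "mouse": "accessory",
    "keyboard": "accessory",
}
DATABASE = [
    ["laptop", "mouse"],
    ["desktop", "keyboard"],
    ["laptop", "keyboard", "mouse"],
    ["mouse", "laptop"],
]

if __name__ == "__main__":
    for pattern in (["laptop", "mouse"], ["computer", "accessory"]):
        print(pattern, support(pattern, DATABASE, PARENT))
```

In this toy data, the generalized pattern `computer, accessory` is supported by more sequences than the specific pattern `laptop, mouse`, which illustrates why patterns may usefully mix items from different levels of the hierarchy.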
Pages: 491-503
Number of pages: 13
Related papers
50 in total
  • [1] Large-Scale Frequent Episode Mining from Complex Event Sequences with Hierarchies
    Ao, Xiang
    Shi, Haoran
    Wang, Jin
    Zuo, Luo
    Li, Hongwei
    He, Qing
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2019, 10 (04)
  • [2] Hierarchies in the large-scale structures of the Universe
    Goldman, T.
    Perez-Mercader, Juan
    INTERNATIONAL JOURNAL OF MODERN PHYSICS D, 2006, 15 (08) : 1199 - 1215
  • [3] LASH: Large-Scale Academic Deep Semantic Hashing
    Guo, Jia-Nan
    Mao, Xian-Ling
    Lan, Tian
    Tu, Rong-Xin
    Wei, Wei
    Huang, Heyan
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (02) : 1734 - 1746
  • [4] LARGE-SCALE SORTING IN UNIFORM MEMORY HIERARCHIES
    VITTER, JS
    NODINE, MH
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1993, 17 (1-2) : 107 - 114
  • [5] Latent Task Adaptation with Large-scale Hierarchies
    Jia, Yangqing
    Darrell, Trevor
    2013 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2013 : 2080 - 2087
  • [6] LARGE-SCALE UNDERGROUND MINING
    ALMGREN, G
    SCANDINAVIAN JOURNAL OF METALLURGY, 1987, 16 (01) : 29 - 32
  • [7] Mining of High-Utility Sequence Patterns in Large-Scale Uncertain Databases
    Wu, Jimmy Ming-Tai
    Liu, Shuo
    Lin, Jerry Chun-Wei
    2022 IEEE INTL CONF ON DEPENDABLE, AUTONOMIC AND SECURE COMPUTING, INTL CONF ON PERVASIVE INTELLIGENCE AND COMPUTING, INTL CONF ON CLOUD AND BIG DATA COMPUTING, INTL CONF ON CYBER SCIENCE AND TECHNOLOGY CONGRESS (DASC/PICOM/CBDCOM/CYBERSCITECH), 2022 : 1103 - 1109
  • [8] Large-Scale Pairwise Sequence Alignments on a Large-Scale GPU Cluster
    Savran, Ibrahim
    Gao, Yang
    Bakos, Jason D.
    IEEE DESIGN & TEST, 2014, 31 (01) : 51 - 61
  • [9] Yukon weighs large-scale mining
    Hiyate, Alisha
    Canadian Mining Journal, 2018, 139 (01) : 25 - 28
  • [10] Large-Scale Multimedia Retrieval and Mining
    Yan, Rong
    Huet, Benoit
    Sukthankar, Rahul
    IEEE MULTIMEDIA, 2011, 18 (01) : 11 - 13