A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream

Citations: 0
Authors
Fu, Weiqi [1]
Liao, Husheng [1]
Jin, Xueyun [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 2017 5TH INTERNATIONAL CONFERENCE ON FRONTIERS OF MANUFACTURING SCIENCE AND MEASURING TECHNOLOGY (FMSMT 2017) | 2017 / Vol. 130
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
frequent pattern mining; semi-structured data stream; schema feature
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Data mining is used to find useful information in massive data sets, and frequent pattern mining is one of its important tasks. Research on frequent pattern mining for semi-structured data has made progress in recent years, and mining over data streams has also received considerable attention. However, only a few studies address both semi-structured data and data streams. This paper proposes an algorithm named SPrefixTreeISpan. It first segments the semi-structured data stream and then uses the pattern-growth method to mine each segment. The results of all segments are maintained in a structure called patternTree. The mining algorithm is further optimized using the inevitable parent-child and inevitable child-parent relationships extracted from the XML schema. Experiments show that SPrefixTreeISpan has better performance.
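
The segment-mine-merge flow described in the abstract can be illustrated with a much simplified sketch. This is not SPrefixTreeISpan itself: it assumes each stream record is a root-to-leaf path of element tags rather than a full tree, grows patterns only as path prefixes, and omits the schema-based optimization. All names here (`segments`, `mine_segment`, `PatternTree`, `mine_stream`) are hypothetical and chosen for illustration.

```python
from collections import Counter
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

Path = Tuple[str, ...]  # assumption: one record = one root-to-leaf tag path


def segments(stream: Iterable[Path], size: int) -> Iterator[List[Path]]:
    """Cut a (possibly unbounded) stream into consecutive segments."""
    it = iter(stream)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def mine_segment(segment: List[Path], min_count: int) -> Dict[Path, int]:
    """Pattern growth on prefixes: extend each frequent prefix by one tag."""
    frequent: Dict[Path, int] = {}
    frontier: List[Tuple[Path, List[Path]]] = [((), segment)]
    while frontier:
        prefix, projected = frontier.pop()
        depth = len(prefix)
        # Count the next tag among the records that share this prefix.
        counts = Counter(p[depth] for p in projected if len(p) > depth)
        for tag, count in counts.items():
            if count >= min_count:
                grown = prefix + (tag,)
                frequent[grown] = count
                # Project the records onto the grown prefix and keep growing.
                matching = [p for p in projected
                            if len(p) > depth and p[depth] == tag]
                frontier.append((grown, matching))
    return frequent


class PatternTree:
    """Trie keyed by tag; each node accumulates support across segments."""

    def __init__(self) -> None:
        self.count = 0
        self.children: Dict[str, "PatternTree"] = {}

    def add(self, pattern: Path, count: int) -> None:
        node = self
        for tag in pattern:
            node = node.children.setdefault(tag, PatternTree())
        node.count += count

    def support(self, pattern: Path) -> int:
        node = self
        for tag in pattern:
            if tag not in node.children:
                return 0
            node = node.children[tag]
        return node.count


def mine_stream(stream: Iterable[Path], segment_size: int,
                min_count: int) -> PatternTree:
    """Mine each segment independently and merge the results into one tree."""
    tree = PatternTree()
    for seg in segments(stream, segment_size):
        for pattern, count in mine_segment(seg, min_count).items():
            tree.add(pattern, count)
    return tree
```

Because each segment is thresholded on its own, a pattern that falls below `min_count` in a segment contributes nothing to the merged tree for that segment; how the paper handles such cases is not stated in the abstract.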
Pages: 1329-1336
Number of pages: 8