A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream

Cited by: 0
Authors
Fu, Weiqi [1]
Liao, Husheng [1]
Jin, Xueyun [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing, Peoples R China
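Source
PROCEEDINGS OF THE 2017 5TH INTERNATIONAL CONFERENCE ON FRONTIERS OF MANUFACTURING SCIENCE AND MEASURING TECHNOLOGY (FMSMT 2017) | 2017, Vol. 130
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
frequent pattern mining; semi-structured data stream; schema feature
DOI
Not available
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract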
Data mining is used to find useful information in massive data sets, and frequent pattern mining is one of its important tasks. Recently, research on frequent pattern mining for semi-structured data has made some progress, and frequent pattern mining over data streams has also received considerable attention. However, only a few studies address both semi-structured data and data streams. This paper proposes an algorithm named SPrefixTreeISpan. We first segment the semi-structured data stream and then use the pattern-growth method to mine each segment. Finally, we maintain all the results in a structure called patternTree. The mining algorithm is further optimized using the inevitable parent-child relationship and the inevitable child-parent relationship extracted from the XML schema. Experiments show that SPrefixTreeISpan has better performance.
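The segment, mine, and merge flow described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' SPrefixTreeISpan: it assumes each stream item is a `(label, children)` tuple, simplifies patterns to root-to-node label paths rather than general subtrees, and omits the schema-based optimization entirely. The names `segments`, `grow`, `mine_segment`, `mine_stream`, and the `PatternTree` class are hypothetical and chosen only for this illustration.

```python
from itertools import islice


def segments(stream, size):
    """Yield consecutive fixed-size batches from an iterable of trees."""
    it = iter(stream)
    while batch := list(islice(it, size)):
        yield batch


def grow(prefix, occurrences, min_support, out):
    """Extend a path pattern by one child label, recursing on frequent extensions.

    occurrences holds (tree_id, node) pairs whose node ends the current prefix.
    """
    by_label = {}
    for tree_id, (_, children) in occurrences:
        for child in children:
            by_label.setdefault(child[0], []).append((tree_id, child))
    for label, occ in by_label.items():
        # Support counts distinct trees, not occurrences within a tree.
        support = len({tree_id for tree_id, _ in occ})
        if support >= min_support:
            pattern = prefix + (label,)
            out[pattern] = support
            grow(pattern, occ, min_support, out)


def mine_segment(batch, min_support):
    """Return {path: support} for the frequent label paths in one segment."""
    out = {}
    # A virtual root lets the first growth step pick the frequent root labels.
    roots = [(tree_id, (None, [tree])) for tree_id, tree in enumerate(batch)]
    grow((), roots, min_support, out)
    return out


class PatternTree:
    """Trie over label paths that accumulates support across segments."""

    def __init__(self):
        self.count = 0
        self.children = {}

    def add(self, path, count):
        node = self
        for label in path:
            node = node.children.setdefault(label, PatternTree())
        node.count += count

    def get(self, path):
        node = self
        for label in path:
            node = node.children.get(label)
            if node is None:
                return 0
        return node.count


def mine_stream(stream, segment_size, min_support):
    """Mine each segment and merge the per-segment results into one trie."""
    tree = PatternTree()
    for batch in segments(stream, segment_size):
        for path, support in mine_segment(batch, min_support).items():
            tree.add(path, support)
    return tree
```

In this simplification, a pattern that is infrequent within a segment contributes nothing to the merged trie, so the accumulated counts are lower bounds on the true stream-wide support. How the paper's patternTree handles such patterns is not stated in the abstract.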
Pages: 1329-1336
Number of pages: 8
Related papers
50 in total
  • [31] Efficient frequent pattern mining based on Linear Prefix tree
    Pyun, Gwangbum
    Yun, Unil
    Ryu, Keun Ho
    KNOWLEDGE-BASED SYSTEMS, 2014, 55 : 125 - 139
  • [32] HFIM: a Spark-based hybrid frequent itemset mining algorithm for big data processing
    Krishan Kumar Sethi
    Dharavath Ramesh
    The Journal of Supercomputing, 2017, 73 : 3652 - 3668
  • [33] HFIM: a Spark-based hybrid frequent itemset mining algorithm for big data processing
    Sethi, Krishan Kumar
    Ramesh, Dharavath
    JOURNAL OF SUPERCOMPUTING, 2017, 73 (08) : 3652 - 3668
  • [34] Tidset-based parallel FP-tree algorithm for the frequent pattern mining problem on PC clusters
    Zhou, Jiayi
    Yu, Kun-Ming
    ADVANCES IN GRID AND PERVASIVE COMPUTING, PROCEEDINGS, 2008, 5036 : 18 - 28
  • [35] Knowledge discovery of design rationale based on frequent-pattern mining
    Jiang, H.
    Yang, W.
    Mei, J.
    Wu, R. L.
    Guo, L.
    AUTOMATIC CONTROL, MECHATRONICS AND INDUSTRIAL ENGINEERING, 2019, : 161 - 166
  • [36] A Method Based on Frequent Pattern Mining to Predict Spectral Availability of HF
    Wu, Chujie
    Cheng, Yunpeng
    Gong, Yuping
    Ding, Guoru
    Yu, Ling
    Zhang, Zhe
    2018 IEEE 18TH INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY (ICCT), 2018, : 1391 - 1396
  • [37] A novel parallel algorithm for frequent pattern mining with privacy preserved in cloud computing environments
    Lin, Kawuu W.
    Deng, Der-Jiunn
    INTERNATIONAL JOURNAL OF AD HOC AND UBIQUITOUS COMPUTING, 2010, 6 (04) : 205 - 215
  • [38] Performance Evaluation of Frequent Pattern Mining Algorithms using Web Log Data for Web Usage Mining
    Gashaw, Yonas
    Liu, Fang
    2017 10TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI), 2017
  • [39] Use of Context for Recommending Code: an Approach Based on Frequent Pattern Mining
    Mendoza, Paul
    PROCEEDINGS OF THE XVII INTERNATIONAL CONFERENCE ON HUMAN COMPUTER INTERACTION INTERACCION 2016, 2016
  • [40] T-Music: A Melody Composer based on Frequent Pattern Mining
    Long, Cheng
    Wong, Raymond Chi-Wing
    Sze, Raymond Ka Wai
    2013 IEEE 29TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2013, : 1332 - 1335