A Schema Feature Based Frequent Pattern Mining Algorithm for Semi-structured Data Stream

Cited by: 0
Authors
Fu, Weiqi [1]
Liao, Husheng [1]
Jin, Xueyun [1]
Affiliations
[1] Beijing Univ Technol, Fac Informat Technol, Beijing, Peoples R China
Source
PROCEEDINGS OF THE 2017 5TH INTERNATIONAL CONFERENCE ON FRONTIERS OF MANUFACTURING SCIENCE AND MEASURING TECHNOLOGY (FMSMT 2017) | 2017 / Vol. 130
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
frequent pattern mining; semi-structured data stream; schema feature;
DOI
Not available
Chinese Library Classification number
T [Industrial Technology];
Subject classification code
08 ;
Abstract
Data mining extracts useful information from massive data sets, and frequent pattern mining is one of its core tasks. Research on frequent pattern mining for semi-structured data has made progress in recent years, and mining over data streams has also received considerable attention. However, only a few studies address semi-structured data and data streams together. This paper proposes an algorithm named SPrefixTreeISpan. It first segments the semi-structured data stream and then mines each segment with the pattern-growth method. Finally, all results are maintained in a structure called patternTree. The mining algorithm is further optimized using the inevitable parent-child and inevitable child-parent relationships extracted from the XML schema. Experiments show that SPrefixTreeISpan has better performance.
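The segment-then-mine-then-merge flow described in the abstract can be illustrated with a minimal sketch. This is not SPrefixTreeISpan: the record gives no algorithmic detail, so the tree encoding `(label, [children])`, the function names, and the fixed segment size are all assumptions made for illustration. Plain support counting over root-to-node label paths stands in for the pattern-growth step, a `Counter` stands in for patternTree, and the schema-based optimization is not modeled.

```python
from collections import Counter
from itertools import islice

# Assumed encoding: a tree is (label, [child trees]).

def segments(stream, size):
    """Cut an iterator of trees into consecutive lists of at most `size` trees."""
    it = iter(stream)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def paths(tree, prefix=()):
    """Yield every root-to-node label path in one tree."""
    label, children = tree
    path = prefix + (label,)
    yield path
    for child in children:
        yield from paths(child, path)

def mine_segment(batch, min_support):
    """Return label paths that occur in at least `min_support` trees of the batch."""
    support = Counter()
    for tree in batch:
        # set() so that a path is counted once per tree, not once per occurrence.
        support.update(set(paths(tree)))
    return {p: c for p, c in support.items() if c >= min_support}

def mine_stream(stream, size, min_support):
    """Mine each segment and accumulate the per-segment results."""
    totals = Counter()  # hypothetical stand-in for the paper's patternTree
    for batch in segments(stream, size):
        totals.update(mine_segment(batch, min_support))
    return totals
```

For example, calling `mine_stream(trees, 2, 2)` on an iterable of such trees returns a `Counter` keyed by label-path tuples. A pattern that is frequent in one segment but not in another only contributes its count from the segments where it passed the threshold, which is one reason real stream miners need a more careful maintenance structure than this sketch uses.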
Pages: 1329 - 1336
Page count: 8
Related papers
50 in total
  • [21] An Efficient Spark-Based Hybrid Frequent Itemset Mining Algorithm for Big Data
    Al-Bana, Mohamed Reda
    Farhan, Marwa Salah
    Othman, Nermin Abdelhakim
    DATA, 2022, 7 (01)
  • [22] A Knowledge Management Framework for Imbalanced Data using Frequent Pattern Mining Based on Bloom Filter
    El-Ghamrawy, Sally M.
    PROCEEDINGS OF 2016 11TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS (ICCES), 2016, : 226 - 231
  • [23] A Comparative Analysis of Frequent Pattern Mining Algorithms Used for Streaming Data
    Shalini
    Jain, Sanjay Kumar
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND AUTOMATION (ICCCA), 2017, : 250 - 255
  • [24] CLPSO - Fuzzy Frequent Pattern Mining from Gene Expression Data
    Mishra, Shruti
    Satapathy, Sandeep Ku.
    Mishra, Debahuti
    2ND INTERNATIONAL CONFERENCE ON COMPUTER, COMMUNICATION, CONTROL AND INFORMATION TECHNOLOGY (C3IT-2012), 2012, 4 : 807 - 811
  • [25] Parallel TID-based frequent pattern mining algorithm on a PC Cluster and grid computing system
    Yu, Kun-Ming
    Zhou, Jiayi
    EXPERT SYSTEMS WITH APPLICATIONS, 2010, 37 (03) : 2486 - 2494
  • [26] Privacy Preserving Frequent Pattern Mining Based on Grouping Randomization
    Guo Y.-H.
    Tong Y.-H.
    Su Y.-Q.
    Ruan Jian Xue Bao/Journal of Software, 2021, 32 (12) : 3929 - 3944
  • [27] RESEARCH ON PARALLEL FREQUENT PATTERN MINING BASED ON ONTOLOGY AND RULES
    Yi, Chenxi
    Sun, Ming
    4TH INTERNATIONAL CONFERENCE ON SMART AND SUSTAINABLE CITY (ICSSC 2017), 2017, : 33 - 37
  • [28] Multi-level Frequent Pattern Mining on Pipeline Incident Data
    Hryhoruk, Connor C. J.
    Leung, Carson K.
    Li, Jingyuan
    Narine, Brandon A.
    Wedel, Felix
    ADVANCED INFORMATION NETWORKING AND APPLICATIONS, VOL 2, AINA 2024, 2024, 200 : 380 - 392
  • [29] An OpenCL Candidate Slicing Frequent Pattern Mining Algorithm on Graphic Processing Units
    Lin, Che-Yu
    Yu, Kun-Ming
    Ouyang, Wen
    Zhou, Jiayi
    2011 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2011, : 2344 - 2349
  • [30] An efficient frequent pattern mining algorithm using a highly compressed prefix tree
    Zhu, Xiaolin
    Liu, Yongguo
    INTELLIGENT DATA ANALYSIS, 2019, 23 : S153 - S173