Accelerated Frequent Closed Sequential Pattern Mining for uncertain data

Cited by: 7
Authors
You, Tao [1 ]
Sun, Yue [1 ]
Zhang, Ying [1 ]
Chen, Jinchao [1 ]
Zhang, Peng [1 ]
Yang, Mei [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Xian 610072, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Uncertain database; Frequent closed sequences; Possible world semantics; SEQUENCES; ALGORITHM; ITEMSETS;
DOI
10.1016/j.eswa.2022.117254
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data uncertainty has been taken into consideration when mining hidden knowledge in a variety of applications. Because the nature of closed sequences is closely tied to possible worlds, recent studies on mining closed sequences from uncertain data have been challenged by the explosion of possible worlds, whose number is exponential in the number of uncertain sequences in the database. Although the basic Probabilistic Frequent Closed Sequences Mining (PFCSM-FF) strategy offers a preliminary solution to this problem, the inclusion-exclusion rules and closure-checking methods used in PFCSM-FF make the mining algorithm very inefficient. On this basis, two further improvements, the PFCSM-CF and PFCSM-CC algorithms, are designed to reduce the search space and simplify the candidate sequence database, which significantly reduces the computational scale. Extensive experiments on real and synthetic datasets demonstrate the efficiency gains of the proposed PFCSM-CC and PFCSM-CF methods. In addition, the usability of the proposed PFCSM-CC algorithm is further demonstrated by its running time, which is similar to that of an existing probabilistic frequent sequence mining algorithm.
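The exponential growth of possible worlds mentioned in the abstract can be illustrated with a minimal sketch. It is not the PFCSM-FF, PFCSM-CF, or PFCSM-CC algorithm; it is a brute-force enumeration under an assumed sequence-level uncertainty model in which each sequence exists independently with a given probability. The names `is_subsequence`, `frequentness_probability`, and the toy database `db` are hypothetical and chosen only for illustration.

```python
from itertools import product


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def frequentness_probability(db, pattern, minsup):
    """Probability that `pattern` has support >= minsup.

    `db` is a list of (sequence, existence_probability) pairs.
    Assumption: sequences exist independently of one another.
    Every one of the 2**len(db) possible worlds is enumerated,
    which is exactly the exponential cost the abstract refers to.
    """
    total = 0.0
    for world in product([False, True], repeat=len(db)):
        world_prob = 1.0
        support = 0
        for present, (seq, p) in zip(world, db):
            world_prob *= p if present else 1.0 - p
            if present and is_subsequence(pattern, seq):
                support += 1
        if support >= minsup:
            total += world_prob
    return total


# Hypothetical toy database: three uncertain sequences over items a, b, c.
db = [("abc", 0.9), ("acb", 0.5), ("bc", 0.8)]
prob = frequentness_probability(db, "ac", minsup=2)
```

With three sequences the loop visits 8 worlds; with 30 sequences it would visit more than a billion, which is why methods that avoid explicit enumeration are needed.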
Pages: 12
Related papers
50 in total
  • [31] Fuzzy Association Rule Mining based Frequent Pattern Extraction from Uncertain Data
    Rajput, D. S.
    Thakur, R. S.
    Thakur, G. S.
    PROCEEDINGS OF THE 2012 WORLD CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGIES, 2012, : 709 - 714
  • [32] Mining Probabilistically Frequent Sequential Patterns in Large Uncertain Databases
    Zhao, Zhou
    Yan, Da
    Ng, Wilfred
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014, 26 (05) : 1171 - 1184
  • [33] NetNCSP: Nonoverlapping closed sequential pattern mining
    Wu, Youxi
    Zhu, Changrui
    Li, Yan
    Guo, Lei
    Wu, Xindong
    KNOWLEDGE-BASED SYSTEMS, 2020, 196 (196)
  • [34] Closed sequential pattern mining for sitemap generation
    Michelangelo Ceci
    Pasqua Fabiana Lanotte
    World Wide Web, 2021, 24 : 175 - 203
  • [35] Closed sequential pattern mining for sitemap generation
    Ceci, Michelangelo
    Lanotte, Pasqua Fabiana
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2021, 24 (01) : 175 - 203
  • [36] Efficient mining of frequent closed XML query pattern
    Feng, Jian-Hua
    Qian, Qian
    Wang, Jian-Yong
    Zhou, Li-Zhu
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2007, 22 (05) : 725 - 735
  • [37] Efficient Mining of Frequent Closed XML Query Pattern
    冯建华
    钱乾
    王建勇
    周立柱
    Journal of Computer Science & Technology, 2007, (05) : 725 - 735
  • [38] Efficient Mining of Frequent Closed XML Query Pattern
    Jian-Hua Feng
    Qian Qian
    Jian-Yong Wang
    Li-Zhu Zhou
    Journal of Computer Science and Technology, 2007, 22 : 725 - 735
  • [39] Frequent sequential pattern mining under differential privacy
    Lu, Guoqing
    Zhang, Xiaojian
    Ding, Liping
    Li, Yanfeng
    Liao, Xin
    Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2015, 52 (12) : 2789 - 2801
  • [40] Finding frequent trajectories by clustering and sequential pattern mining
    Arthur A. Shaw
    N. P. Gopalan
    Journal of Traffic and Transportation Engineering (English Edition), 2014, 1 (06) : 393 - 403