A survey of high utility sequential patterns mining methods

Cited by: 0
Authors
Zhang, Ruihua [1]
Han, Meng [1, 2]
He, Feifei [1]
Meng, Fanxing [1]
Li, Chunpeng [1]
Affiliations
[1] North Minzu Univ, Sch Comp Sci & Engn, Yinchuan, Ningxia, Peoples R China
[2] State Ethn Affairs Commiss, Key Lab Intelligent Proc Image & Graph, Yinchuan, Ningxia, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Survey; high utility sequential patterns; incremental data; data streams; hidden patterns; EFFICIENT ALGORITHM; PREFIXSPAN; INTERNET;
DOI
10.3233/JIFS-232107
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, demand for high utility sequential pattern (HUSP) mining has grown. Unlike high utility itemset mining, HUSP mining must contend with the "combinatorial explosion" of sequence data, which makes it more challenging. This survey aims to provide a general, comprehensive, and structured overview of state-of-the-art HUSP mining methods from a novel perspective. First, the data structures used by the mining methods are illustrated from the serial and parallel perspectives, and the pros and cons of the algorithms are summarized. To protect data privacy, many HUSP hiding algorithms have been proposed; these are classified into array-based, chain-based, and matrix-based algorithms according to their key technologies, and the hiding strategies and evaluation metrics they adopt are summarized. Next, a taxonomy of the most common and state-of-the-art incremental mining algorithms is presented, covering tree-based and projection-based approaches. To handle the latest sequences in a data stream, existing algorithms often use a window model to update dynamically; these algorithms are divided into sliding-window and landmark-window methods for analysis. A summary of derived high utility sequential patterns is then presented. Finally, in view of the deficiencies of existing HUSP research, the authors outline their planned future work.
Pages: 8049-8077
Page count: 29