High-utility sequential pattern mining in incremental database

Cited by: 1
Authors
Yan, Huizhen [1]
Li, Fengyang [1]
Hsieh, Ming-Chia [2]
Wu, Jimmy Ming-Tai [3]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao, Peoples R China
[2] I Shou Univ, Dept Tourism Intelligence Serv & Technol, Kaohsiung, Taiwan
[3] Natl Kaohsiung Univ Sci & Technol, Dept Informat Management, Kaohsiung, Taiwan
Keywords
High-utility sequential patterns (HUSPs); Big data; Incremental mining; Data mining; Pre-large sequence;
DOI
10.1007/s11227-024-06568-x
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Subject classification code
0812;
Abstract
Previous algorithms designed for efficient mining of sequence patterns have primarily focused on processing static databases. However, in the context of dynamic database mining, where new data are constantly added, rescanning the entire database to update the information becomes necessary. This maintenance and update process consumes significant time and resources, leading to delayed responses. To address this issue, this paper proposes an incremental mining algorithm called Pre-HUSPM, which leverages the concept of pre-large to insert new sequences into the dynamic database while preserving the discovered efficient sequence patterns. Furthermore, a novel threshold, denoted as $SWU_{max}$, is introduced to minimize the frequency of database rescans and enhance the algorithm's speed. The experimental results show that the algorithm greatly reduces computation time and resource consumption, enabling the algorithm to respond faster to data changes and generate new mining results. This algorithm aids manufacturers in designing and producing products that align with customer preferences based on previous products, thereby improving operational efficiency and guiding customers toward wise purchasing decisions, ultimately resulting in higher profits for the company.
Pages: 34