A Novel Approach for Mining High-Utility Sequential Patterns in Sequence Databases

Cited by: 108
Authors
Ahmed, Chowdhury Farhan [1]
Tanbeer, Syed Khairuzzaman [1]
Jeong, Byeong-Soo [1]
Affiliations
[1] Kyung Hee Univ, Database Lab, Dept Comp Engn, Coll Elect & Informat, Yongin, South Korea
Keywords
Data mining; sequential patterns; high-utility patterns; knowledge discovery
DOI
10.4218/etrij.10.1510.0066
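Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract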
Mining sequential patterns is an important research issue in data mining and knowledge discovery with broad applications. However, existing sequential pattern mining approaches consider only binary frequency values of items in sequences and assign equal importance to distinct items. They therefore cannot accurately represent many real-world scenarios. In this paper, we propose a novel framework for mining high-utility sequential patterns, which extracts information more applicable to real life from sequence databases with non-binary frequency values of items in sequences and different importance/significance values for distinct items. We also propose two new algorithms for mining high-utility sequential patterns: Utility Level, which uses a level-wise candidate generation approach, and Utility Span, which uses a pattern growth approach. Extensive performance analyses show that our algorithms are very efficient and scalable for mining high-utility sequential patterns.
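To make the idea of non-binary frequencies and per-item importance concrete, the following is a minimal sketch and not the paper's Utility Level or Utility Span algorithm. It assumes a toy database in which each sequence is an ordered list of itemsets mapping an item to a quantity. It also assumes that the utility of a pattern in one sequence is the maximum, over all ordered occurrences, of the sum of quantity times per-item importance. That maximum rule, the names `profit`, `database`, `pattern_utility_in_sequence`, and `database_utility`, and all the numbers are illustrative assumptions.

```python
from typing import Dict, List, Optional

Itemset = Dict[str, int]      # item -> quantity (non-binary frequency)
Sequence = List[Itemset]      # ordered list of itemsets


def pattern_utility_in_sequence(pattern: List[str],
                                sequence: Sequence,
                                profit: Dict[str, int]) -> int:
    """Return the best utility of `pattern` in `sequence`, or 0 if absent.

    Assumption: each pattern item must match in a strictly later itemset
    than the previous one, and the occurrence with the largest total
    (quantity * profit) is the one that counts.
    """
    # best[k] = best utility found so far for the first k pattern items
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for itemset in sequence:
        # Go from long prefixes to short ones so that a single itemset
        # never serves two consecutive pattern items.
        for k in range(len(pattern), 0, -1):
            item = pattern[k - 1]
            if item in itemset and best[k - 1] is not None:
                candidate = best[k - 1] + itemset[item] * profit[item]
                if best[k] is None or candidate > best[k]:
                    best[k] = candidate
    return best[-1] if best[-1] is not None else 0


def database_utility(pattern: List[str],
                     database: List[Sequence],
                     profit: Dict[str, int]) -> int:
    """Sum the per-sequence utilities of `pattern` over the whole database."""
    return sum(pattern_utility_in_sequence(pattern, s, profit) for s in database)


# Hypothetical per-item importance values and a three-sequence database.
profit = {"a": 2, "b": 5, "c": 1}
database = [
    [{"a": 3}, {"b": 1, "c": 4}, {"a": 1}, {"b": 2}],
    [{"b": 2}, {"a": 1, "c": 2}],
    [{"a": 2, "b": 1}, {"b": 3}],
]

total = database_utility(["a", "b"], database, profit)
min_utility = 30  # hypothetical threshold
print(total, total >= min_utility)
```

In this toy data the pattern a followed by b occurs in two of the three sequences, so a plain support count would report 2. The utility view instead weighs those occurrences by quantity and importance, which is the distinction the abstract draws between frequency-based and utility-based mining.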
Pages: 676-686
Number of pages: 11