ITUFP: A fast method for interactive mining of Top-K frequent patterns from uncertain data

Cited by: 8
Authors
Davashi, Razieh [1, 2]
Affiliations
[1] Islamic Azad Univ, Fac Comp Engn, Najafabad Branch, Najafabad, Iran
[2] Islamic Azad Univ, Big Data Res Ctr, Najafabad Branch, Najafabad, Iran
Keywords
Data mining; Frequent pattern mining; Uncertain frequent pattern; Uncertain data; Interactive mining; ITEMSETS; TREE; THRESHOLD; SUPPORT;
DOI
10.1016/j.eswa.2022.119156
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Top-K Uncertain Frequent Pattern (UFP) mining is an interesting topic in data mining. The existing TUFP algorithm supports only static mining of Top-K UFPs. In practice, however, users need to change the K threshold repeatedly to extract information that fits the requirements of their application. In such interactive environments, TUFP must re-scan the database and rebuild the UP-Lists and CUP-Lists from scratch each time K changes, which is very time-consuming. This paper proposes ITUFP, a fast method for interactive mining of Top-K UFPs. ITUFP uses a new data structure, the IMCUP-List, to store pattern information efficiently. It creates the UP-Lists in a single database scan, extracts the patterns by generating IMCUP-Lists, and stores all the lists. When K changes, the algorithm only updates the IMCUP-Lists instead of rebuilding the lists from scratch. ITUFP therefore follows the "build once, mine many" principle: the UP-Lists and IMCUP-Lists are created once and reused for mining with different K values. According to the authors, this is the first study on interactive mining of Top-K UFPs. Extensive experiments on sparse and dense uncertain data show that the proposed method is very efficient for interactive mining of Top-K UFPs.
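The "build once, mine many" idea in the abstract can be illustrated with a minimal Python sketch. This is not the ITUFP algorithm and does not implement UP-Lists, CUP-Lists, or IMCUP-Lists as defined in the paper; it is a simplified stand-in. It assumes that the expected support of an itemset is the sum, over transactions, of the product of its items' existence probabilities (item independence assumed), and it enumerates candidates exhaustively rather than pruning. The names `InteractiveTopK`, `build_item_lists`, `join`, and the toy database `db` are all hypothetical.

```python
from itertools import combinations


def build_item_lists(db):
    """Single database scan: map each 1-itemset to {transaction id: probability}."""
    lists = {}
    for tid, txn in enumerate(db):
        for item, prob in txn.items():
            lists.setdefault(frozenset([item]), {})[tid] = prob
    return lists


def join(a, b):
    """Intersect two tid -> probability lists, multiplying probabilities
    (assumes items occur independently within a transaction)."""
    return {tid: p * b[tid] for tid, p in a.items() if tid in b}


class InteractiveTopK:
    """Builds per-itemset lists once and reuses them for every K."""

    def __init__(self, db):
        self.cache = build_item_lists(db)  # the only database scan
        self.items = sorted(next(iter(s)) for s in self.cache)

    def _list_for(self, itemset):
        # Build the list for an itemset from its prefix and last item,
        # storing the result so later queries never recompute it.
        if itemset not in self.cache:
            ordered = sorted(itemset)
            prefix = self._list_for(frozenset(ordered[:-1]))
            last = self.cache[frozenset(ordered[-1:])]
            self.cache[itemset] = join(prefix, last)
        return self.cache[itemset]

    def top_k(self, k, max_len=3):
        """Return the k itemsets (up to max_len items) with highest expected support."""
        scored = []
        for n in range(1, max_len + 1):
            for combo in combinations(self.items, n):
                lst = self._list_for(frozenset(combo))
                scored.append((combo, sum(lst.values())))
        # Highest expected support first; ties broken by itemset order.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored[:k]


# Toy uncertain database: each transaction maps item -> existence probability.
db = [
    {"a": 0.9, "b": 0.8},
    {"a": 0.5, "c": 0.4},
    {"a": 1.0, "b": 0.5, "c": 0.6},
]

miner = InteractiveTopK(db)
print(miner.top_k(2))  # first query fills the cache
print(miner.top_k(4))  # K changes: cached lists are reused, no re-scan
```

The second call to `top_k` touches only the cached lists, which is the behavior the abstract contrasts with re-scanning the database for every new K. A practical method would also prune candidates instead of enumerating them all.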
Pages: 15