On the Efficient Representation of Datasets as Graphs to Mine Maximal Frequent Itemsets

Cited by: 33
Authors
Halim, Zahid [1]
Ali, Omer [1]
Khan, Muhammad Ghufran [1]
Affiliations
[1] Ghulam Ishaq Khan Inst Engn Sci & Technol, Machine Intelligence Res Grp MInG, Fac Comp Sci & Engn, Topi 23460, Pakistan
Keywords
Itemsets; Data mining; Databases; Data structures; Task analysis; Benchmark testing; Machine intelligence; Efficient frequent itemsets extraction; efficient data structure; graph utility; maximal frequent itemsets; ASSOCIATION RULES; ALGORITHM; PATTERNS;
DOI
10.1109/TKDE.2019.2945573
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Frequent itemsets mining is an active research problem in the domain of data mining and knowledge discovery. With advances in database technology and an exponential increase in the data to be stored, there is a need for efficient approaches that can quickly extract useful information from such large datasets. Frequent Itemsets (FIs) mining is a data mining task that finds itemsets in a transactional database which occur together above a certain frequency. Finding these FIs usually requires multiple passes over the database, which makes efficient algorithms crucial for mining FIs. This work presents a graph-based approach for representing a complete transactional database. The proposed graph-based representation stores all information in the database that is relevant for extracting FIs in a single pass. An algorithm that extracts the FIs from the graph-based structure is then presented. Experimental results are reported comparing the proposed approach with 17 related FIs mining methods on six benchmark datasets. The results show that the proposed approach outperforms the others in terms of execution time.
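To make the task definition in the abstract concrete, the following is a minimal brute-force sketch of frequent and maximal frequent itemset extraction. It is not the graph-based method of the paper, whose data structure is not described in this record. The function names, the toy database, and the use of an absolute support count as the threshold are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets seen in >= min_support transactions.

    Brute-force illustration only (hypothetical helper, not the paper's algorithm).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            cand = frozenset(candidate)
            # Support = number of transactions that contain every item of the candidate.
            count = sum(1 for t in transactions if cand <= t)
            if count >= min_support:
                result[cand] = count
                found_any = True
        if not found_any:
            # If no itemset of this size is frequent, no larger one can be.
            break
    return result


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Toy transactional database (assumed for illustration).
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
fis = frequent_itemsets(db, min_support=3)
mfis = maximal_itemsets(fis)
```

In this toy database every single item and every pair is frequent at a support threshold of 3, while the full triple occurs only twice, so the maximal frequent itemsets are the three pairs. The sketch rescans all transactions for every candidate, which is the repeated-pass cost that the abstract says a one-pass representation is meant to avoid.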
Pages: 1674-1691
Number of pages: 18