PaMPa-HD: a Parallel MapReduce-based frequent Pattern miner for High-Dimensional data

Cited by: 13
Authors
Apiletti, Daniele [1]
Baralis, Elena [1]
Cerquitelli, Tania [1]
Garza, Paolo [1]
Pulvirenti, Fabio [1]
Michiardi, Pietro [2]
Affiliations
[1] Politecnico di Torino, Dipartimento di Automatica e Informatica, Turin, Italy
[2] Eurecom, Sophia Antipolis, France
Source
2015 IEEE International Conference on Data Mining Workshop (ICDMW) | 2015
DOI
10.1109/ICDMW.2015.18
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Frequent closed itemset mining is among the most complex exploratory techniques in data mining, and it can uncover hidden correlations in transactional datasets. The explosion of Big Data is driving new parallel and distributed approaches. Unfortunately, most of them are designed to cope with low-dimensional datasets, and no distributed frequent closed itemset mining algorithm exists for high-dimensional data. This work introduces PaMPa-HD, a parallel MapReduce-based frequent closed itemset mining algorithm for high-dimensional datasets, based on Carpenter. Experiments on both real and synthetic datasets show the efficiency and scalability of PaMPa-HD.
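As a rough illustration of the terms used in the abstract, the following Python sketch finds frequent closed itemsets by enumerating sets of rows (transactions) and intersecting them, which is the general row-enumeration idea associated with Carpenter. It is not the PaMPa-HD algorithm and contains no MapReduce parallelization or pruning; the function name `closed_frequent_itemsets` and the toy data are hypothetical, and the brute-force enumeration is exponential in the number of rows.

```python
from itertools import combinations


def closed_frequent_itemsets(transactions, min_support):
    """Brute-force row enumeration (illustrative only, exponential in rows).

    The intersection of any group of transactions is a closed itemset, so
    intersecting every group of at least `min_support` rows yields all
    frequent closed itemsets. Returns {itemset: support}.
    """
    rows = [frozenset(t) for t in transactions]
    closed = {}
    # Start at 1 so that an intersection always has at least one operand.
    for k in range(max(1, min_support), len(rows) + 1):
        for row_ids in combinations(range(len(rows)), k):
            common = frozenset.intersection(*(rows[i] for i in row_ids))
            if common and common not in closed:
                # Support = number of rows that contain the whole itemset.
                closed[common] = sum(1 for r in rows if common <= r)
    return closed


# Hypothetical toy dataset: three transactions over items a, b, c, d.
data = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "d"}]
result = closed_frequent_itemsets(data, min_support=2)
for itemset, support in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), support)
```

In this toy dataset, `{a, b}` is frequent (it appears in two rows) but not closed, because its superset `{a, b, c}` has the same support; only `{a}` and `{a, b, c}` are reported. Enumerating rows instead of items is attractive when a dataset has few rows and very many items, which is the high-dimensional setting the abstract refers to.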
Pages: 839-846
Number of pages: 8