A Distributed Framework for Event Log Analysis using MapReduce

Cited by: 0
Authors
Dewangan, Sandeep Kumar [1]
Pandey, Shikha [1]
Verma, Toran [1]
Affiliations
[1] RCET, Dept Comp Sci & Engn, Bhilai 490024, India
Source
PROCEEDINGS OF 2016 INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION CONTROL AND COMPUTING TECHNOLOGIES (ICACCCT) | 2016
Keywords
User Sessionization; Hadoop; MapReduce; Event Log Files
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Event log files are among the most common datasets that companies exploit for customer behavior analysis. These records are often unordered and need to be grouped by a key for effective analysis. One example is to group similar users that have different session IDs to facilitate further analysis. This kind of analysis is known as user sessionization. In this paper, we propose a distributed framework that combines Hadoop and MapReduce to analyze event log files and sessionize users based on IP address and timestamp. The evaluation results show that execution time decreases, and performance improves, as the number of nodes increases.
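To illustrate what sessionizing by IP address and timestamp can look like in a map/shuffle/reduce shape, here is a minimal single-process sketch in Python. It is not the authors' implementation and does not use Hadoop. The record format (`ip timestamp` per line), the 30-minute inactivity gap, and all function names are assumptions made for illustration; the abstract does not state them.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity timeout; the abstract does not specify one.
SESSION_GAP = timedelta(minutes=30)


def map_phase(lines):
    """Map: emit (ip, timestamp) for each 'ip ISO-timestamp' record (assumed format)."""
    for line in lines:
        ip, ts = line.split()
        yield ip, datetime.fromisoformat(ts)


def shuffle(pairs):
    """Group values by key, standing in for the MapReduce shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(ip, timestamps, gap=SESSION_GAP):
    """Reduce: sort one IP's events and start a new session after each long gap."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # still within the current session
        else:
            sessions.append([ts])     # first event, or gap exceeded
    return ip, sessions


def sessionize(lines):
    """Run map, shuffle, and reduce in sequence; returns {ip: [session, ...]}."""
    grouped = shuffle(map_phase(lines))
    return dict(reduce_phase(ip, tss) for ip, tss in grouped.items())
```

Because each IP's events are reduced independently, the reduce step is the part a cluster could spread across nodes; the unordered input is handled by sorting inside the reducer.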
Pages: 503-506
Page count: 4