Discovery of User Navigation Patterns on a web site through Data Mining Algorithms

Cited by: 0
Authors
Revathy, P. [1]
Ramani, R. Geetha [1]
Jacob, Shomona Gracia [1]
Nancy, P. [1]
Affiliations
[1] Rajalakshmi Engn Coll, Dept Comp Sci & Engn, Madras, Tamil Nadu, India
Source
2012 INTERNATIONAL CONFERENCE ON FUTURE COMMUNICATION AND COMPUTER TECHNOLOGY (ICFCCT 2012) | 2012
Keywords
Web Usage mining; Data mining; Classification; Social Networks;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Web mining is the application of data mining algorithms to extract significant patterns from the web. Web usage mining is the technique used to identify users' needs and interests on the World Wide Web. This paper highlights the effect of classification on the server logs from the msnbc dataset for the month of September 1998. The dataset contains information on 65536 users and their navigation behavior through the 17 web pages (frontpage, news, tech, local, opinion, on-air, misc, weather, etc.). To obtain efficient classification patterns, the original msnbc data is pre-processed into subsets, each focusing on one page (news, health, on-air, weather, etc.). The class field in each training set therefore corresponds to one of the core pages (frontpage, news, etc.), resulting in 17 subsets. This research aims to identify important patterns in the usage of the web pages listed in this data and to bring out the interests of users while navigating the web site. We evaluated the performance of eight classification algorithms on the msnbc dataset and report higher accuracy for Quinlan's C4.5 algorithm and the Random Tree algorithm. The error rates produced by the algorithms indicate the usage density of the core pages (News, Weather, etc.). Moreover, the misclassification rates indicate the usage style of different users on a web page, with less error generated for pages with more user hits, which reveals the user navigation patterns.
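The pre-processing step described in the abstract (one training subset per core page, with that page as the class field) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: it assumes the usual msnbc session format (one line per user, space-separated category IDs from 1 to 17), assumes the category order given in the public dataset description, and assumes visit counts of the other 16 pages as features. The function names `parse_sessions`, `build_subset`, and `build_all_subsets` are hypothetical.

```python
from collections import Counter

# Assumed category order (IDs 1..17) from the public msnbc dataset description.
PAGES = ["frontpage", "news", "tech", "local", "opinion", "on-air", "misc",
         "weather", "health", "living", "business", "sports", "summary",
         "bbs", "travel", "msn-news", "msn-sports"]


def parse_sessions(lines):
    """Turn lines such as '1 1 2 6' into lists of page names (hypothetical helper)."""
    sessions = []
    for line in lines:
        ids = [int(tok) for tok in line.split()]
        if ids:  # skip blank lines
            sessions.append([PAGES[i - 1] for i in ids])
    return sessions


def build_subset(sessions, target):
    """Build one training subset whose class field is the page `target`.

    Assumption: features are visit counts of the other 16 pages, and the
    class label records whether the session visited `target` at all.
    """
    others = [p for p in PAGES if p != target]
    rows = []
    for session in sessions:
        counts = Counter(session)
        features = [counts[p] for p in others]
        label = "yes" if counts[target] > 0 else "no"
        rows.append((features, label))
    return rows


def build_all_subsets(sessions):
    """Return the 17 subsets, keyed by the page used as the class field."""
    return {page: build_subset(sessions, page) for page in PAGES}


# Tiny made-up sample in the assumed session format.
sample = ["1 1", "2", "3 2 2 4 2 2 2 3 3", "6 7 7 7 6 6 8 8 8 8"]
sessions = parse_sessions(sample)
subsets = build_all_subsets(sessions)
```

Each of the 17 subsets could then be passed to any classifier (for example a C4.5-style decision tree), and the per-page error rates compared, as the abstract describes.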
Pages: 167-172
Page count: 6