An Enhanced Pre-Processing Technique for Web Log Mining by Removing Web Robots

Citations: 0
Authors
Nithya, P. [1]
Sumathi, P. [2]
Affiliations
[1] Manonmaniam Sundaranar Univ, Tirunelveli, Tamil Nadu, India
[2] Chikkanna Govt Arts Coll, Tirupur, Tamil Nadu, India
Source
2012 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (ICCIC) | 2012
Keywords
Preprocessing; Data Cleaning; Path Completion; Travel Path Set; Content Path Set
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Nowadays, the internet has become a useful source of information in day-to-day life. This has driven enormous growth of the World Wide Web, both in traffic volume and in the size and complexity of websites. Web Usage Mining (WUM) is one of the main applications of data mining, artificial intelligence and related techniques to web data; it forecasts users' visiting behavior and infers their interests by analyzing the samples. WUM is directly involved in a wide range of applications, such as e-commerce, e-learning, Web analytics and information retrieval. Web log data is one of the major sources, containing information about the links users visited, their browsing patterns and the time spent on a particular page or link. This information can be used in several applications, such as adaptive web sites, customized services, customer profiles, pre-fetching and the creation of attractive web sites. There are several problems with existing web usage mining approaches; in particular, existing algorithms are difficult to apply in practice. New research is therefore needed to predict the future behavior of web users accurately and with short execution time. WUM consists of preprocessing, pattern discovery and pattern analysis. Log data is characteristically noisy and ambiguous, so preprocessing is essential for effective mining. In this paper, a novel pre-processing technique is proposed that removes local and global noise and web robots. The Anonymous Microsoft Web Dataset and the MSNBC.com Anonymous Web Dataset are used to evaluate the proposed preprocessing technique.
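The abstract does not describe how web robots are detected, so the following is only a minimal sketch of one common heuristic, not the authors' technique. It assumes a server log in Combined Log Format (the two datasets named above use different formats) and drops every request from a host that fetched /robots.txt or whose user-agent string matches an illustrative keyword list. The names BOT_AGENT_PATTERN, LOG_LINE, SAMPLE_LOG, parse and remove_robots are hypothetical.

```python
import re

# Assumption: an illustrative keyword list, not taken from the paper.
BOT_AGENT_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)

# Assumption: entries follow the Combined Log Format.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Made-up sample entries, for illustration only.
SAMPLE_LOG = [
    '10.0.0.1 - - [01/Jan/2012:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '10.0.0.2 - - [01/Jan/2012:10:00:01 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" "ExampleAgent/1.0"',
    '10.0.0.2 - - [01/Jan/2012:10:00:02 +0000] "GET /news.html HTTP/1.1" 200 900 "-" "ExampleAgent/1.0"',
    '10.0.0.3 - - [01/Jan/2012:10:00:03 +0000] "GET /index.html HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
]


def parse(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


def remove_robots(lines):
    """Drop unparseable lines and all requests attributed to web robots."""
    entries = [entry for entry in map(parse, lines) if entry is not None]
    # Heuristic 1: a host that requested /robots.txt is treated as a robot.
    robot_hosts = {e["host"] for e in entries if e["path"] == "/robots.txt"}
    # Heuristic 2: the user-agent string contains a robot keyword.
    return [
        e for e in entries
        if e["host"] not in robot_hosts
        and not BOT_AGENT_PATTERN.search(e["agent"])
    ]


if __name__ == "__main__":
    for entry in remove_robots(SAMPLE_LOG):
        print(entry["host"], entry["path"])
```

Both heuristics can misclassify traffic: a robot may spoof a browser user agent, and a human visitor may open /robots.txt, so a filter of this kind is at best a first pass.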
Pages: 662-665
Page count: 4
Related papers
50 in total
  • [31] Frequent pagesets from web log by enhanced weighted association rule mining
    Malarvizhi, S. P.
    Sathiyabhama, B.
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2016, 19 (01): 269 - 277
  • [32] Frequent pagesets from web log by enhanced weighted association rule mining
    S. P. Malarvizhi
    B. Sathiyabhama
    Cluster Computing, 2016, 19 : 269 - 277
  • [33] System Log Pre-processing to Improve Failure Prediction
    Zheng, Ziming
    Lan, Zhiling
    Park, Byung H.
    Geist, Al
    2009 IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS & NETWORKS (DSN 2009), 2009: 572 - +
  • [34] A Constraint Programming Approach for Web Log Mining
    Kemmar, Amina
    Lebbah, Yahia
    Loudni, Samir
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY AND WEB ENGINEERING, 2016, 11 (04) : 24 - 42
  • [35] Frequent Sequence Mining in Web Log Data
    Weichbroth, Pawel
    MAN-MACHINE INTERACTIONS 5, ICMMI 2017, 2018, 659 : 459 - 467
  • [36] Simple Web log mining system (SWLMS)
    Yang, Yiling
    Guan, Xudong
    Lu, Lina
    You, Jinyuan
    Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University, 2000, 34 (07): 932 - 935
  • [37] Mining Web Access Log for the Personalization Recommendation
    Peng, Xueping
    Cao, Yujuan
    Niu, Zhendong
    2008 INTERNATIONAL CONFERENCE ON MULTIMEDIA AND INFORMATION TECHNOLOGY, PROCEEDINGS, 2008: 172 - 175
  • [38] Web Log Mining based on Website Topic
    Yu, Xiaobing
    Guo, Shunsheng
    Peng, Zhao
    SEVENTH WUHAN INTERNATIONAL CONFERENCE ON E-BUSINESS, VOLS I-III: UNLOCKING THE FULL POTENTIAL OF GLOBAL TECHNOLOGY, 2008: 874 - 878
  • [39] Design and Implementation of WEB Log Mining System
    Ni, Xianjun
    2009 INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND TECHNOLOGY, VOL II, PROCEEDINGS, 2009: 425 - 427
  • [40] A HowNet based web log mining algorithm
    Li, Chen
    Qi, Jiayin
    Shu, Huaying
    RESEARCH AND PRACTICAL ISSUES OF ENTERPRISE INFORMATION SYSTEMS II, VOL 2, 2008, 255 : 923 - +