An Enhanced Pre-Processing Technique for Web Log Mining by Removing Web Robots

Cited by: 0
Authors
Nithya, P. [1]
Sumathi, P. [2]
Affiliations
[1] Manonmaniam Sundaranar Univ, Tirunelveli, Tamil Nadu, India
[2] Chikkanna Govt Arts Coll, Tirupur, Tamil Nadu, India
Source
2012 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (ICCIC) | 2012
Keywords
Preprocessing; Data Cleaning; Path Completion; Travel Path set; Content Path Set;
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The internet has become a useful source of information in day-to-day life, which has driven enormous growth of the World Wide Web in traffic volume and in the size and complexity of websites. Web Usage Mining (WUM) is one of the main applications of data mining, artificial intelligence, and related techniques to web data; it forecasts users' visiting behavior and infers their interests by investigating usage samples. WUM is directly involved in a wide range of applications, such as e-commerce, e-learning, Web analytics, and information retrieval. Web log data is one of the major sources of this information: it records the links users visited, their browsing patterns, and the time spent on a particular page or link. This information can be used in applications such as adaptive websites, modified services, customer summaries, pre-fetching, and the generation of attractive websites. Existing web usage mining approaches have several problems; in particular, existing algorithms are difficult to apply in practice. New research is therefore needed to predict the future performance of web users accurately and with short execution time. WUM consists of preprocessing, pattern discovery, and pattern analysis. Log data is characteristically noisy and unclear, so preprocessing is essential for effective mining. In this paper, a novel preprocessing technique is proposed that removes local and global noise and web robots. The Anonymous Microsoft Web Dataset and the MSNBC.com Anonymous Web Dataset are used to evaluate the proposed preprocessing technique.
Pages: 662-665
Number of pages: 4