Review on Modern Data Preprocessing Techniques in Web Usage Mining (WUM)

Authors
Sukumar, P. [1]
Robert, L. [1]
Yuvaraj, S. [1]
Affiliation
[1] Govt Arts Coll, Dept CS, Coimbatore, Tamil Nadu, India
Keywords
WUM; Web mining; Web usage mining; Web log mining; Data Preprocessing; Data cleaning algorithms; User Identification algorithms; Session Identification algorithms;
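DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract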
The web contains a huge amount of data that grows in volume and dimension day by day. Data mining applications that make use of Web data are referred to as Web mining, one of the hot topics in the field of data mining. Web mining is classified into three types based on the kind of knowledge extracted: Web structure mining, Web content mining, and Web usage mining. The Web usage mining process can be divided into three interdependent stages: data preprocessing, pattern discovery, and pattern analysis. This paper is mainly concerned with Web usage mining. Its contribution is an investigation of data preprocessing that determines the effectiveness of the algorithms and their limitations and verifies their standing. Various preprocessing algorithms and their heuristics are implemented in programming languages, applied, and examined. Data preprocessing algorithms parse the raw log files, which involves splitting the log files, and then cleanse them to obtain data of superior quality. Based on this data, the unique users are identified, which in turn helps to identify user sessions.
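The stages named in the abstract (parsing, cleaning, user identification, session identification) can be illustrated with a minimal Python sketch. This record does not state which algorithms or heuristics the paper evaluates, so everything below is an assumption chosen for illustration: the input is taken to be in Apache Combined Log Format, a user is approximated by the pair of IP address and user agent, and a session ends after a 30-minute gap between requests. All function and constant names are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Assumed input: Apache Common/Combined Log Format (referrer and agent optional).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

# Assumed cleaning rules: embedded resources are not page views.
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")
# Assumed session heuristic: a gap longer than this starts a new session.
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Split one raw log line into named fields; return None if malformed."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    rec = m.groupdict()
    rec["time"] = datetime.strptime(rec["time"], "%d/%b/%Y:%H:%M:%S %z")
    rec["status"] = int(rec["status"])
    rec["agent"] = rec["agent"] or ""
    return rec


def is_relevant(rec):
    """Data cleaning: keep successful GET page requests from non-robot agents."""
    url = rec["url"].split("?", 1)[0].lower()
    if url.endswith(IGNORED_SUFFIXES) or url == "/robots.txt":
        return False
    if rec["method"] != "GET" or not 200 <= rec["status"] < 300:
        return False
    return "bot" not in rec["agent"].lower()


def identify_sessions(records):
    """Group records by (ip, agent), then split each group on long time gaps."""
    by_user = {}
    for rec in sorted(records, key=lambda r: r["time"]):
        by_user.setdefault((rec["ip"], rec["agent"]), []).append(rec)
    sessions = []
    for user, recs in by_user.items():
        current = [recs[0]]
        for prev, rec in zip(recs, recs[1:]):
            if rec["time"] - prev["time"] > SESSION_TIMEOUT:
                sessions.append((user, current))
                current = []
            current.append(rec)
        sessions.append((user, current))
    return sessions


def preprocess(lines):
    """Parse, clean, and sessionize an iterable of raw log lines."""
    parsed = (parse_line(line) for line in lines)
    cleaned = [rec for rec in parsed if rec is not None and is_relevant(rec)]
    return identify_sessions(cleaned)
```

Calling `preprocess(open("access.log"))` on a hypothetical log file would return a list of `(user, requests)` pairs, one per session. The IP-plus-agent rule and the fixed timeout are simple heuristics that can miscount users behind proxies or shared machines, which is the kind of limitation a comparative study of preprocessing algorithms would need to examine.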
Pages: 64-69
Number of pages: 6