Efficient mining of temporal traversal patterns from very large Web logs

Cited by: 0
Authors
Chen, ZX [1]
Affiliations
[1] Univ Texas Pan Amer, Dept Comp Sci, Edinburg, TX 78539 USA
Source
DMIN '05: PROCEEDINGS OF THE 2005 INTERNATIONAL CONFERENCE ON DATA MINING | 2005
Keywords
web mining; access session; temporal content page; temporal traversal pattern; suffix tree;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A Web page in a Web access session is considered a temporal content page if its access time is greater than the average access time of all the pages in the session. A maximal temporal reference of a Web user in an access session is a longest consecutive sequence of Web pages in the session that ends at a temporal content page and contains no other temporal content page. The problem of efficiently mining frequent temporal traversal patterns, i.e., large temporal reference sequences of maximal temporal references, from very large Web logs is important in Web mining. This paper seeks algorithmic solutions to the problem with the best possible efficiency. We first design linear-time algorithms for finding maximal temporal references from Web logs. We then devise a linear-time algorithm for mining frequent temporal traversal patterns, using the technique developed in [8, 9] for fast construction of "shallow" generalized suffix trees over a very large alphabet.
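The two definitions in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, only a single-pass reading of the definitions. The session representation (a list of `(page, access_time)` pairs), the function name `maximal_temporal_references`, and the choice to discard pages that follow the last temporal content page are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical session format: (page identifier, access time in seconds).
Session = List[Tuple[str, float]]


def maximal_temporal_references(session: Session) -> List[List[str]]:
    """Split one session into maximal temporal references.

    A page is a temporal content page when its access time is strictly
    greater than the session average. Each returned sequence ends at one
    temporal content page and contains no other one.
    """
    if not session:
        return []

    # First pass: average access time over the whole session.
    average = sum(time for _, time in session) / len(session)

    references: List[List[str]] = []
    current: List[str] = []
    # Second pass: cut the session right after every temporal content page.
    for page, time in session:
        current.append(page)
        if time > average:
            references.append(current)
            current = []

    # Assumption: trailing pages after the last temporal content page do not
    # end at a content page, so they are not reported as a reference.
    return references


if __name__ == "__main__":
    demo = [("A", 5), ("B", 40), ("C", 3), ("D", 4), ("E", 60), ("F", 2)]
    print(maximal_temporal_references(demo))
```

In the demo session the average access time is 19 seconds, so B and E are the temporal content pages and the references are A, B and C, D, E. Both passes visit each page once, which matches the linear-time bound claimed in the abstract for this step.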
Pages: 10-16
Number of pages: 7
Related papers
38 in total
  • [1] [Anonymous], 1996, Advances in Knowledge Discovery and Data Mining, DOI 10.1007/978-3-319-31750-2
  • [2] [Anonymous], J KNOWLEDGE INFORMAT
  • [3] [Anonymous], 2001, WORKSHOP WEB MINING
  • [4] [Anonymous], 1999, NETWORK INFORM SYST
  • [5] [Anonymous], 1997, ALGORITHMS STRINGS T, DOI 10.1017/CBO9780511574931
  • [6] Analysis of navigation behaviour in web sites integrating multiple information systems
    Berendt, B
    Spiliopoulou, M
    [J]. VLDB JOURNAL, 2000, 9 (01) : 56 - 75
  • [7] BORGES J, 1999, MS99
  • [8] BUCHNER AG, 1998, ACM SIGMOD RECORD, V27, P54
  • [9] BUCHNER AG, 1999, MS99
  • [10] CATLEDGE LD, 1995, COMPUTER NETWORKS IS, P27