A web page usage prediction scheme using sequence indexing and clustering techniques

Cited by: 18
Authors
Dimopoulos, Costantinos [1]
Makris, Christos [1]
Panagis, Yannis [1]
Theodoridis, Evangelos [1]
Tsakalidis, Athanasios [1]
Affiliation
[1] Univ Patras, Comp Engn & Informat Dept, Rion 26500, Greece
Keywords
World Wide Web; Web mining; On-line web page recommendation; Weighted sequences; Patterns
DOI
10.1016/j.datak.2009.04.010
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we consider the problem of predicting web page usage within a web site by modeling users' navigation history and web page content with weighted suffix trees. The predicted navigation can be exploited either by an on-line recommendation system in a web site or by a web page cache system. The proposed method has the advantage that it requires a constant amount of computational effort per user action and consumes a relatively small amount of extra memory. These features make the method well suited to an on-line working environment. Finally, we evaluated the proposed scheme experimentally on various web site log files and web pages, and found that its prediction quality is fairly good and, in many cases, superior. (C) 2009 Elsevier B.V. All rights reserved.
Pages: 371-382
Page count: 12