Research on mining user browsing patterns in large web logs based on Poisson sampling and Sequence Alignment Method

Cited by: 0
Authors
Liu, Peiqian [1]
An, Jiyu [1]
Guo, Hairu [1]
Affiliations
[1] Henan Polytech Univ Jiaozuo, Sch Comp Sci & Technol, Kaifeng 454003, Peoples R China
Source
2008 PROCEEDINGS OF INFORMATION TECHNOLOGY AND ENVIRONMENTAL SYSTEM SCIENCES: ITESS 2008, VOL 4 | 2008
Keywords
data mining; Poisson sampling; Sequence Alignment Method (SAM);
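DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract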
The continuous growth in the size and use of the Internet is creating large server logs on web servers and making information harder to find. A sophisticated method to organize the layout of the information and assist user navigation is therefore particularly important. In this paper, we evaluate the feasibility of using Poisson sampling and the Sequence Alignment Method (SAM) to mine large web log data. A sample set selected by Poisson sampling is statistically representative of the characteristics of the entire dataset. In addition, users are partitioned into clusters using SAM as a non-Euclidean distance measure.
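The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record gives no algorithmic detail, so the sketch assumes that Poisson sampling means picking log records at positions whose gaps are exponentially distributed, and it approximates the SAM distance with a plain weighted edit distance over page sequences. The function names, the `mean_gap` parameter, and the cost weights are all hypothetical.

```python
import random


def poisson_sample(records, mean_gap, seed=None):
    """Sample records at positions whose gaps are exponentially distributed.

    Assumption: "Poisson sampling" is read here as sampling along the log at
    the arrival points of a Poisson process with the given mean gap.
    """
    rng = random.Random(seed)
    sample = []
    last_index = -1
    position = rng.expovariate(1.0 / mean_gap)
    while position < len(records):
        index = int(position)
        if index != last_index:  # skip repeats when two arrivals share an index
            sample.append(records[index])
            last_index = index
        position += rng.expovariate(1.0 / mean_gap)
    return sample


def sam_distance(seq_a, seq_b, indel_cost=1.0, subst_cost=2.0):
    """Weighted edit distance between two page sequences.

    Assumption: a simplified stand-in for SAM; the weights are illustrative
    and are not taken from the paper.
    """
    prev = [j * indel_cost for j in range(len(seq_b) + 1)]
    for i, page_a in enumerate(seq_a, start=1):
        curr = [i * indel_cost]
        for j, page_b in enumerate(seq_b, start=1):
            diagonal = prev[j - 1] + (0.0 if page_a == page_b else subst_cost)
            curr.append(min(prev[j] + indel_cost,      # delete page_a
                            curr[j - 1] + indel_cost,  # insert page_b
                            diagonal))                 # match or substitute
        prev = curr
    return prev[-1]


# Hypothetical usage: sample a log, then compare two browsing sessions.
log_records = list(range(1000))
sampled = poisson_sample(log_records, mean_gap=10, seed=42)
distance = sam_distance(["home", "news", "sports"], ["home", "sports"])
```

The pairwise distances produced this way could be passed to any clustering routine that accepts a precomputed distance matrix, which is one way to read the abstract's "non-Euclidean" clustering step.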
Pages: 1119-1121
Number of pages: 3
Related papers
5 in total
  • [1] BILINKIS I, 1992, RANDOMIZED SIGNAL PR
  • [2] CALINSKI R, 1994, COMMUNICATIONS STAT, V3, P1
  • [3] Hay Birgit, 2003, J RETAIL CONSUM SERV, V10, P145
  • [4] MANNILA H, 1997, 4 INT WORKSH TEMP RE, V10, P136
  • [5] Analysis of large data logs: an application of Poisson sampling on excite web queries
    Ozmutlu, HC
    Spink, A
    Ozmutlu, S
    [J]. INFORMATION PROCESSING & MANAGEMENT, 2002, 38 (04) : 473 - 490