Research on mining user browsing patterns in large web logs based on Poisson sampling and Sequence Alignment Method
Cited by: 0
Authors:
Liu, Peiqian [1]
An, Jiyu [1]
Guo, Hairu [1]
Affiliations:
[1] Henan Polytech Univ Jiaozuo, Sch Comp Sci & Technol, Kaifeng 454003, Peoples R China
Source:
2008 PROCEEDINGS OF INFORMATION TECHNOLOGY AND ENVIRONMENTAL SYSTEM SCIENCES: ITESS 2008, VOL 4
Year: 2008
Keywords:
data mining;
Poisson sampling;
Sequence Alignment Method (SAM);
DOI: not available
Chinese Library Classification:
TP [Automation technology, computer technology];
Discipline classification code:
0812;
Abstract:
The continuous growth in the size and use of the Internet is creating large server logs on web servers and making information harder to find. A sophisticated method for organizing the layout of information and assisting user navigation is therefore particularly important. In this paper, we evaluate the feasibility of using Poisson sampling and SAM to mine large web log data. Sample sets selected by Poisson sampling are statistically representative of the characteristics of the entire dataset. In addition, users are partitioned into clusters using a non-Euclidean distance measure called the Sequence Alignment Method (SAM).
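The two techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their inclusion probabilities or alignment scoring, so the function names `poisson_sample` and `alignment_distance`, the caller-supplied inclusion probability, and the unit-cost edit distance used as a simplified stand-in for SAM are all assumptions made for illustration.

```python
import random


def poisson_sample(items, inclusion_prob, seed=None):
    """Poisson sampling: each item is kept independently of the others,
    with its own inclusion probability (hypothetical helper).

    inclusion_prob is a function mapping an item to a probability in [0, 1].
    The sample size is therefore random, not fixed in advance.
    """
    rng = random.Random(seed)
    return [item for item in items if rng.random() < inclusion_prob(item)]


def alignment_distance(seq_a, seq_b):
    """Unit-cost edit distance between two page sequences.

    Assumption: this is a simplified stand-in for a sequence alignment
    distance. It counts insertions, deletions, and substitutions, each
    with cost 1; the paper's actual operation weights are not stated
    in the abstract.
    """
    previous = list(range(len(seq_b) + 1))
    for i, page_a in enumerate(seq_a, start=1):
        current = [i]
        for j, page_b in enumerate(seq_b, start=1):
            substitution_cost = 0 if page_a == page_b else 1
            current.append(min(
                previous[j] + 1,                      # delete page_a
                current[j - 1] + 1,                   # insert page_b
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


# Hypothetical sessions: each is the ordered list of pages one user visited.
sessions = [
    ["home", "news", "sports"],
    ["home", "sports"],
    ["home", "shop", "cart", "checkout"],
]

# Keep each session with probability 0.5 (an arbitrary illustrative value).
sample = poisson_sample(sessions, lambda session: 0.5, seed=1)

# Pairwise distances over all sessions could feed a clustering step.
distances = [[alignment_distance(a, b) for b in sessions] for a in sessions]
```

Because the distance is defined on ordered sequences rather than on coordinate vectors, it is non-Euclidean in the sense the abstract describes; any clustering method that accepts a precomputed distance matrix could consume `distances`.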
Pages: 1119-1121
Page count: 3