Cross-validation of neural network applications for automatic new topic identification

Cited by: 9
Authors
Ozmutlu, H. Cenk [1]
Cavdur, Fatih [1]
Ozmutlu, Seda [1]
Affiliations
[1] Uludag Univ, Dept Ind Engn, Gorukle, Bursa, Turkey
Source
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY | 2008 / Vol. 59 / Issue 03
DOI
10.1002/asi.20696
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This study reports experiments on the cross-validation of an artificial neural network application that automatically identifies topic changes in Web search engine user sessions, using data logs from different Web search engines to train and test the neural network. Sample data logs from the FAST and Excite search engines are used. The results show that topic shifts and continuations in user sessions on one Web search engine can be identified with neural networks trained on a data log from a different Web search engine. Although FAST and Excite users differ in some characteristics (e.g., number of queries per session, number of topics per session), the results demonstrate that users of both search engines behave similarly as they shift from one topic to another during a single search session. The key finding is that a neural network trained on a selected data log could be universal; that is, it can be applied to all Web search engine transaction logs regardless of the source of the training data log.
Pages: 339-362
Number of pages: 24