Cross-lingual analysis of English and Chinese web search

Cited by: 0
Authors
Lin, Peiguang [1]
Zhang, Tong [2]
Xia, Menglong [3]
Zhou, Jin [4]
Nie, Peiyao [1]
Affiliations
[1] Shandong Univ Finance & Econ, Sch Comp Sci & Technol, Jinan 250001, Shandong, Peoples R China
[2] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510000, Guangdong, Peoples R China
[3] Macau Univ Sci & Technol, Fac Hospitality & Tourism Management, Ave Wai Long, Taipa 999078, Macau, Peoples R China
[4] Univ Jinan, Shandong Prov Key Lab Network Based Intelligent C, Jinan 250001, Shandong, Peoples R China
Keywords
cross-lingual analysis; web search analysis; search query; POS distribution; search session; session entropy; query reformulation; click graph analysis; query features; web search burstiness; ENGINE; ALGORITHM; BEHAVIOR
DOI
10.1504/IJWGS.2018.095663
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The non-English Web has grown steadily in recent years, so language-dependent and user-based search paradigms are becoming increasingly important for search engines. Unfortunately, most existing work on web search analysis is still English-based. To understand how the behaviour of non-English users resembles and differs from that of English users, we propose a framework for analysing the web search behaviour of users in a cross-lingual context. The framework comprises 10 factors, which are applied at the query level, the session level and the corpus level respectively. Used together, these factors help characterise user web search behaviour across different languages from both statistical and semantic perspectives. The framework proves effective not only in revealing the commonalities and distinctions of web search across languages, but also in informing the design of search paradigms in a cross-lingual scenario.
Pages: 376-399
Page count: 24