A study of results overlap and uniqueness among major Web search engines

Cited by: 73
Authors
Spink, Amanda
Jansen, Bernard J.
Blakely, Chris
Koshman, Sherry
Affiliations
[1] Queensland Univ Technol, Fac Informat Technol, Brisbane, Qld 4001, Australia
[2] Penn State Univ, Sch Informat Sci & Technol, University Pk, PA 16802 USA
[3] Market Strategy Manager Infospace Inc, Search & Directory, Bellevue, WA 98004 USA
[4] Univ Pittsburgh, Sch Informat Sci, Pittsburgh, PA 15260 USA
Keywords
Web search engine; overlap; Google; Yahoo; MSN Search; Ask Jeeves; Dogpile; Infospace Inc
DOI
10.1016/j.ipm.2005.11.001
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The performance and capabilities of Web search engines are an important area of research. Millions of people worldwide use Web search engines every day. This paper reports the results of a major study examining the overlap among results retrieved by multiple Web search engines for a large set of more than 10,000 queries. Previous smaller studies have discussed a lack of overlap in results returned by Web search engines for the same queries. The goal of the current study was to conduct a large-scale study to measure the overlap of search results on the first result page (both non-sponsored and sponsored) across the four most popular Web search engines, at specific points in time, using a large number of queries. The Web search engines included in the study were MSN Search, Google, Yahoo! and Ask Jeeves. Our study then compares these results with the first-page results retrieved for the same queries by the metasearch engine Dogpile.com. Two sets of randomly selected user-entered queries, one of 10,316 queries and the other of 12,570 queries, from Infospace's Dogpile.com search engine (the first set was from Dogpile, the second was from across the Infospace Network of search properties) were submitted to the four single Web search engines. Findings show that the percentage of total results unique to only one of the four Web search engines was 84.9%, shared by two of the four Web search engines was 11.4%, shared by three of the Web search engines was 2.6%, and shared by all four Web search engines was 1.1%. This small degree of overlap shows the significant difference in the way major Web search engines retrieve and rank results in response to given queries. Results point to the value of metasearch engines in Web retrieval to overcome the biases of individual search engines. © 2005 Elsevier Ltd. All rights reserved.
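The overlap figures reported in the abstract (the share of results returned by exactly one, two, three, or four engines) can be illustrated with a minimal sketch. It is not the authors' procedure: the function name `overlap_distribution`, the engine labels, and the toy URLs are hypothetical, results are assumed to match when their URL strings are identical, and only a single query is handled, whereas the study aggregated over thousands of queries.

```python
from collections import Counter


def overlap_distribution(results_by_engine):
    """Map k -> percent of distinct results returned by exactly k engines.

    results_by_engine: dict of engine name -> list of first-page result URLs.
    Assumption: two results are "the same" when their URL strings are equal.
    """
    # Count, for each distinct URL, how many engines returned it.
    engines_per_url = Counter()
    for urls in results_by_engine.values():
        for url in set(urls):  # ignore duplicates within one engine's page
            engines_per_url[url] += 1

    total = len(engines_per_url)
    if total == 0:
        return {}

    # Count how many distinct URLs were returned by exactly k engines.
    urls_per_k = Counter(engines_per_url.values())
    return {
        k: 100.0 * urls_per_k.get(k, 0) / total
        for k in range(1, len(results_by_engine) + 1)
    }


# Toy first-page results for one query (hypothetical data).
toy_results = {
    "engine_a": ["u1", "u2", "u3"],
    "engine_b": ["u1", "u4", "u5"],
    "engine_c": ["u1", "u2", "u6"],
    "engine_d": ["u1", "u7", "u8"],
}
dist = overlap_distribution(toy_results)
for k, pct in dist.items():
    print(f"returned by exactly {k} engine(s): {pct:.1f}%")
```

In this toy case, six of the eight distinct URLs appear on only one engine's page, so most results are unique to a single engine, which is the same pattern the abstract reports at much larger scale.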
Pages: 1379-1391
Number of pages: 13
Related papers
26 in total
[1]   Comparing rankings of search results on the Web [J].
Bar-Ilan, J .
INFORMATION PROCESSING & MANAGEMENT, 2005, 41 (06) :1511-1519
[2]   A technique for measuring the relative size and overlap of public Web search engines [J].
Bharat, K ;
Broder, A .
COMPUTER NETWORKS AND ISDN SYSTEMS, 1998, 30 (1-7) :379-388
[3]  
Buzikashvili N, 2002, LECT NOTES ARTIF INT, V2569, P226
[4]   Discriminating meta-search: a framework for evaluation [J].
Chignell, MH ;
Gwizdka, J ;
Bodner, RC .
INFORMATION PROCESSING & MANAGEMENT, 1999, 35 (03) :337-362
[5]  
DING W, 1998, P ANN C AM SOC INF S, P136
[6]   Experiences with selecting search engines using metasearch [J].
Dreilinger, D ;
Howe, AE .
ACM TRANSACTIONS ON INFORMATION SYSTEMS, 1997, 15 (03) :195-222
[7]   Classical retrieval and overlap measures satisfy the requirements for rankings based on a Lorenz curve [J].
Egghe, L ;
Rousseau, R .
INFORMATION PROCESSING & MANAGEMENT, 2006, 42 (01) :106-120
[8]  
FERREIRA J, 2004, P WWW INT 2004 IADIS
[9]  
Gauch S., 1996, J UNIVERS COMPUT SCI, V2, P637
[10]   Finding information on the World Wide Web: the retrieval effectiveness of search engines [J].
Gordon, M ;
Pathak, P .
INFORMATION PROCESSING & MANAGEMENT, 1999, 35 (02) :141-180