A new method for automatic performance comparison of search engines

Cited by: 10
Authors
Li L. [1]
Shang Y. [1]
Affiliation
[1] Department of Computer Engineering & Computer Science, University of Missouri–Columbia, Columbia, 65211, MO
Keywords
Friedman statistic; performance comparison; probability of win; relevance evaluation; search engine
DOI
10.1023/A:1018790907285
Abstract
In this paper, we present a new method for automatically comparing the performance of search engines on measures such as precision. Based on queries randomly selected from a specific domain of interest, the method uses robots to automatically query the target search engines, evaluates the relevance of the returned links to the query either automatically, based on the vector space model, or manually, and then applies statistical measures, including the probability of win and the Friedman statistic, to compare the performance of the search engines. We show the experimental results of the new method on three search engines: AltaVista, Google, and InfoSeek. The method arrived at the same performance comparison result with either the automatic relevance evaluation method or the manual method. In addition, our results show that the probability of win is a better metric than the Friedman statistic for performance comparison. The advantages of the new method are that it is fast, flexible, and consistent, and that it can adapt to fast-changing search engines. © 2000, Kluwer Academic Publishers.
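The pipeline summarized in the abstract (score relevance with a vector space model, then compare engines statistically) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: `cosine_relevance` uses raw term frequencies with whitespace tokenization, `friedman_statistic` is the textbook Friedman chi-square without tie correction, and `empirical_win_rate` is a simple stand-in (the fraction of queries on which one engine scores strictly higher than another) for the paper's probability of win, whose exact definition is not given here. The score table is made-up example data, not a result from the paper.

```python
from collections import Counter
from math import sqrt


def cosine_relevance(query, page):
    """Cosine similarity between raw term-frequency vectors of two texts (assumed weighting)."""
    q, p = Counter(query.lower().split()), Counter(page.lower().split())
    dot = sum(q[t] * p[t] for t in q)
    norm = sqrt(sum(v * v for v in q.values())) * sqrt(sum(v * v for v in p.values()))
    return dot / norm if norm else 0.0


def average_ranks(scores):
    """Rank the scores of one query: 1 = lowest; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def friedman_statistic(table):
    """Friedman chi-square for n queries (rows) by k engines (columns), no tie correction."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


def empirical_win_rate(table, a, b):
    """Fraction of queries on which engine a scores strictly higher than engine b."""
    wins = sum(1 for row in table if row[a] > row[b])
    return wins / len(table)


# Hypothetical precision scores: 4 queries (rows) by 3 unnamed engines (columns).
scores = [
    [0.6, 0.8, 0.4],
    [0.5, 0.9, 0.3],
    [0.7, 0.6, 0.2],
    [0.4, 0.7, 0.5],
]
print(friedman_statistic(scores))        # compare against a chi-square with k - 1 degrees of freedom
print(empirical_win_rate(scores, 1, 0))  # how often engine 1 beats engine 0
```

The Friedman statistic only indicates whether the engines differ overall, while a pairwise win rate says which engine tends to come out ahead and how often; that difference in what the two measures report is one way to read the abstract's preference for the probability of win.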
Pages: 241-247
Page count: 6