p2pDating: Real life inspired semantic overlay networks for Web search

Cited by: 15
Authors
Parreira, Josiane Xavier [1]
Michel, Sebastian [1]
Weikum, Gerhard [1]
Affiliations
[1] Max Planck Inst Informat, D-66123 Saarbrucken, Germany
Keywords
semantic overlay networks; P2P information systems; distributed PageRank computation; distributed Web search
DOI
10.1016/j.ipm.2006.09.007
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We consider a network of autonomous peers forming a logically global but physically distributed search engine, where every peer has its own local collection generated by independently crawling the Web. A challenging task in such systems is to efficiently route user queries to peers that can deliver high quality results and be able to rank these returned results, thus satisfying the users' information need. However, the problem inherent with this scenario is selecting a few promising peers out of an a priori unlimited number of peers. In recent research a rather strict notion of semantic overlay networks has been established. In most approaches, peers are connected to other peers based on a rigid semantic profile by clustering them based on their contents. In contrast, our strategy follows the spirit of peer autonomy and creates semantic overlay networks based on the notion of "peer-to-peer dating". Peers are free to decide which connections they create and which they want to avoid based on various usefulness estimators. The proposed techniques can be easily integrated into existing systems as they require only small additional bandwidth consumption as most messages can be piggybacked onto established communication. We show how we can greatly benefit from these additional semantic relations during query routing in search engines, such as Minerva, and in the JXP algorithm, which computes the PageRank authority measure in a completely decentralized manner. (c) 2006 Elsevier Ltd. All rights reserved.
Pages: 643-664
Page count: 22