SSW: A small-world-based overlay for peer-to-peer search

被引:30
作者
Li, Mei [1 ]
Lee, Wang-Chien [2 ]
Sivasubramaniam, Anand [2 ]
Zhao, Jing [3 ]
机构
[1] Microsoft Corp, Redmond, WA 98052 USA
[2] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
[3] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Kowloon, Hong Kong, Peoples R China
基金
美国国家科学基金会;
关键词
distributed systems; distributed indexing; overlay; Internet content sharing; peer-to-peer systems;
D O I
10.1109/TPDS.2007.70757
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Peer-to-peer (P2P) systems have become a popular platform for sharing and exchanging voluminous information among thousands or even millions of users. The massive amount of information shared in such systems mandates efficient semantic-based search instead of key-based search. The majority of existing proposals can only support simple key-based search rather than semantic-based search. This paper presents the design of an overlay network, namely, semantic small world (SSW), that facilitates efficient semantic-based search in P2P systems. SSW achieves the efficiency based on four ideas: 1) semantic clustering, where peers with similar semantics organize into peer clusters, 2) dimension reduction, where to address the high maintenance overhead associated with capturing high-dimensional data semantics in the overlay, peer clusters are adaptively mapped to a one-dimensional naming space, 3) small world network, where peer clusters form into a one-dimensional small world network, which is search efficient with low maintenance overhead, and 4) efficient search algorithms, where peers perform efficient semantic-based search, including approximate point query and range query in the proposed overlay. Extensive experiments using both synthetic data and real data demonstrate that SSW is superior to the state of the art on various aspects, including scalability, maintenance overhead, adaptivity to distribution of data and locality of interest, resilience to peer failures, load balancing, and efficiency in support of various types of queries on data objects with high dimensions.
引用
收藏
页码:735 / 749
页数:15
相关论文
共 24 条
[1]  
[Anonymous], 2003, SIGIR 03 P 26 ANN IN, DOI DOI 10.1145/860435.860491
[2]  
[Anonymous], P 15 INT C WORLD WID
[3]  
[Anonymous], P VLDB 05 VLDB END
[4]  
Aspnes J, 2003, SIAM PROC S, P384
[5]  
Bawa M., 2005, P 14 INT C WORLD WID, P651, DOI DOI 10.1145/1060745.1060840
[6]   Matrices, vector spaces, and information retrieval [J].
Berry, MW ;
Drmac, Z ;
Jessup, ER .
SIAM REVIEW, 1999, 41 (02) :335-362
[7]  
BHARAMBE AR, 2004, P ACM SIGCOMM, P353
[8]  
CHAWATHE Y, 2005, P 2005 C APPL TECHN, P97
[9]  
Iamnitchi A, 2002, LECT NOTES COMPUT SC, V2429, P232
[10]  
JAGADISH HV, 2005, P 31 INT C VER LARG, P661