CBL: Exploiting Community Based Locality for Efficient Content Search Service in Online Social Networks

Cited by: 5
Authors
Chen, Hanhua [1]
Jin, Hai [1]
Zhang, Fan [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Serv Comp Technol & Syst Lab, Cluster & Grid Comp Lab, Sch Comp Sci & Technol, Wuhan 430074, Hubei, Peoples R China
Keywords
Community-based locality; online social networks; SEGMENTATION;
DOI
10.1109/TSC.2015.2501821
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Retrieving relevant data for users in online social network (OSN) systems is a challenging problem. Cassandra, a storage system used by popular OSN systems such as Facebook and Twitter, relies on a DHT-based scheme to randomly partition the personal data of users among servers across multiple data centers. Although DHT is highly scalable for hosting a large number of users (personal data), it leads to costly inter-server communications across data centers due to the complex interconnection and interaction among OSN users. In this paper, we explore how to retrieve OSN content in a cost-effective way while retaining the simple and robust nature of OSNs. Our approach exploits a simple yet powerful principle called Community-Based Locality (CBL), which posits that if a user has a one-hop neighbor within a particular community, it is very likely that the user has other one-hop neighbors inside the same community. We demonstrate the existence of community-based locality in diverse traces of popular OSN systems such as Facebook, Orkut, Flickr, YouTube, and LiveJournal. Based on this observation, we design a CBL-based algorithm to build the content index in OSN systems. By partitioning and indexing the relevant data of users within a community on the same server in the data center, the CBL-based index avoids a significant amount of inter-server communication during searching, making it efficient to retrieve relevant data for a user in large-scale OSNs. In addition, the CBL-based scheme provides much faster search responses and balanced loads. We conduct comprehensive trace-driven simulations to evaluate the performance of the proposed scheme. Results show that our scheme reduces network traffic by 73 percent and query latency by 35 percent compared with existing schemes.
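The placement idea described in the abstract can be illustrated with a toy sketch. Everything below is hypothetical and not taken from the paper: the six-user graph, the community labels, and the function names are made up, and the round-robin placement is only a deterministic stand-in for DHT-style random partitioning (it is not Cassandra's actual hashing). The sketch counts how many one-hop neighbor lookups would have to cross servers under each placement.

```python
from typing import Dict, List

# Hypothetical toy friendship graph: user -> one-hop neighbors.
FRIENDS: Dict[str, List[str]] = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a", "e", "f"],
    "e": ["d", "f"],
    "f": ["d", "e"],
}

# Hypothetical community labels (assumed to be given, e.g. by some
# community-detection step that is outside the scope of this sketch).
COMMUNITY: Dict[str, int] = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}


def scattered_placement(users: List[str], n_servers: int) -> Dict[str, int]:
    """Stand-in for random partitioning: spread users round-robin,
    ignoring the social graph entirely."""
    return {u: i % n_servers for i, u in enumerate(sorted(users))}


def community_placement(community: Dict[str, int], n_servers: int) -> Dict[str, int]:
    """Place every member of a community on the same server."""
    return {u: c % n_servers for u, c in community.items()}


def remote_lookups(friends: Dict[str, List[str]], placement: Dict[str, int]) -> int:
    """Count (user, neighbor) pairs whose data sit on different servers."""
    return sum(
        1
        for user, neighbors in friends.items()
        for neighbor in neighbors
        if placement[user] != placement[neighbor]
    )


if __name__ == "__main__":
    users = list(FRIENDS)
    print("scattered:", remote_lookups(FRIENDS, scattered_placement(users, 2)))
    print("community:", remote_lookups(FRIENDS, community_placement(COMMUNITY, 2)))
```

In this toy graph only the single link between the two communities crosses servers under community placement, whereas the graph-oblivious placement splits most neighbor pairs. The sketch says nothing about load balancing or index construction, which the abstract also attributes to the proposed scheme.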
Pages: 868-878
Page count: 11