TripleBit: a Fast and Compact System for Large Scale RDF Data

Cited by: 99
Authors
Yuan, Pingpeng [1]
Liu, Pu [1]
Wu, Buwen [1]
Jin, Hai [1]
Zhang, Wenya [1]
Liu, Ling [2]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Comp Sci & Technol, Serv Comp Tech & Syst Lab, Wuhan, Hubei, Peoples R China
[2] Georgia Inst Technol, Coll Comp, Sch Comp Sci, Distributed Data Intens Syst Lab, Atlanta, GA 30332 USA
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2013 / Vol. 6 / No. 7
Funding
U.S. National Science Foundation
DOI
10.14778/2536349.2536352
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The volume of RDF data has continued to grow over the past decade, and many known RDF datasets have billions of triples. A grand challenge in managing RDF data at this scale is how to access it efficiently. A popular approach to addressing the problem is to build a full set of permutations of (S, P, O) indexes. Although this approach has been shown to accelerate joins by orders of magnitude, its large space overhead limits its scalability and makes it heavyweight. In this paper, we present TripleBit, a fast and compact system for storing and accessing RDF data. The design of TripleBit has three salient features. First, the compact design of TripleBit reduces both the size of stored RDF data and the size of its indexes. Second, TripleBit introduces two auxiliary index structures, the ID-Chunk bit matrix and the ID-Predicate bit matrix, to minimize the cost of index selection during query evaluation. Third, its query processor dynamically generates an optimal execution ordering for join queries, leading to fast query execution and an effective reduction in the size of intermediate results. Our experiments show that TripleBit outperforms RDF-3X, MonetDB, and BitMat on LUBM, UniProt, and BTC 2012 benchmark queries, and that it offers orders-of-magnitude performance improvements for some complex join queries.
Pages: 517-528
Page count: 12