Processing Billions of RDF Triples on a Single Machine using Streaming and Sorting

Cited by: 27
Authors
Corcoglioniti, Francesco [1]
Rospocher, Marco [1]
Mostarda, Michele [1]
Amadori, Marco [1]
Affiliations
[1] Fondazione Bruno Kessler, Via Sommarive 18, I-38123 Trento, Italy
Source
30TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, VOLS I AND II | 2015
Keywords
RDF Processing; Streaming; Sorting; Linked Data; RDFPRO
DOI
10.1145/2695664.2695720
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
We consider the feasibility of processing billions of RDF triples on a single commodity machine using streaming and sorting techniques, focusing on RDF processing tasks relevant to Linked Data consumption: data filtering and transformation, RDFS inference, owl:sameAs smushing, and statistics extraction. To investigate this research question, we built RDFPRO (RDF Processor), an open source tool that provides streaming and sorting-based processors for the considered tasks and allows their sequential and parallel composition in complex pipelines. An empirical evaluation of RDFPRO in four application scenarios (dataset analysis, filtering, merging, and massaging) shows the effectiveness of the tool and allows us to answer our research question positively.
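To make one of the tasks named in the abstract concrete, the following is a minimal Python sketch of owl:sameAs smushing followed by sort-based duplicate removal. It is not RDFPRO's implementation: the function names (`find`, `union`, `smush`), the tuple representation of triples, the choice of the lexicographically smallest term as the canonical one, and the decision to drop the owl:sameAs triples from the output are all assumptions made for illustration. An in-memory sort stands in for the external sort that a billion-triple dataset would need.

```python
from typing import Dict, List, Sequence, Tuple

Triple = Tuple[str, str, str]
SAME_AS = "owl:sameAs"  # assumed prefixed-name form of the predicate


def find(parent: Dict[str, str], x: str) -> str:
    """Return the canonical term of x's alias class, compressing the path."""
    parent.setdefault(x, x)
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def union(parent: Dict[str, str], a: str, b: str) -> None:
    """Merge the alias classes of a and b; the smallest term stays canonical."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        lo, hi = sorted((ra, rb))
        parent[hi] = lo


def smush(triples: Sequence[Triple]) -> List[Triple]:
    """Rewrite aliased terms to one canonical term, then sort and deduplicate."""
    parent: Dict[str, str] = {}

    # Pass 1: stream over the data and collect owl:sameAs equivalences.
    for s, p, o in triples:
        if p == SAME_AS:
            union(parent, s, o)

    def canon(term: str) -> str:
        # Terms never seen in an owl:sameAs triple are left untouched.
        return find(parent, term) if term in parent else term

    # Pass 2: stream again, rewriting subjects and objects.
    rewritten = [(canon(s), p, canon(o)) for s, p, o in triples if p != SAME_AS]

    # Sorting brings duplicates next to each other, so one linear scan removes them.
    rewritten.sort()
    result: List[Triple] = []
    for t in rewritten:
        if not result or result[-1] != t:
            result.append(t)
    return result


example = [
    ("ex:a", "owl:sameAs", "ex:b"),
    ("ex:b", "owl:sameAs", "ex:c"),
    ("ex:c", "ex:name", '"Alice"'),
    ("ex:a", "ex:name", '"Alice"'),
    ("ex:d", "ex:knows", "ex:b"),
]
smushed = smush(example)
```

Only the alias map has to fit in memory here; the two passes read the triples sequentially, which is the property that makes a streaming formulation plausible on a single machine.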
Pages: 368-375
Number of pages: 8