Distributed RDFS Reasoning with MapReduce

Cited by: 0
Authors
Cetin, Yigit [1]
Abul, Osman [1]
Affiliation
[1] TOBB Univ Econ & Technol, Dept Comp Engn, Ankara, Turkey
Source
INFORMATION SCIENCES AND SYSTEMS 2014 | 2014
Keywords
Big data; MapReduce; Hadoop; RDFS reasoning
DOI
10.1007/978-3-319-09465-6_32
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We live in the big data age, in which many computational tasks either generate or consume large datasets. This makes parallel and distributed computing key to scalability. MapReduce is a programming model for processing large datasets in a parallel and distributed fashion on a cluster of computers. Because the size and complexity of RDFS documents are increasing rapidly, RDFS reasoning has to embrace big data solutions. The output of one RDFS reasoning job can be the input to another, and that output grows as the input documents get bigger. In this study, an indexing method is proposed to speed up RDFS reasoning over Hadoop clusters. We also explore the utility of caching and of the Hadoop ecosystem tools Apache Hive and Apache Pig for this task. Experimental evaluations on the DBpedia and Freebase datasets show that the indexing method is quite effective and offers scalable solutions. The performance of caching and Apache Hive is also found to be acceptable.
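Illustrative sketch (not from the paper): the abstract does not describe the proposed indexing method, so the code below only shows, under stated assumptions, how one standard RDFS rule could be phrased as a map and a reduce step. The rule is subclass inheritance (rdfs9 in the W3C RDF Semantics): if x has type C and C is a subclass of D, then x has type D. The function names, the in-memory `run_job` driver standing in for Hadoop's shuffle, and the `ex:` sample triples are all hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def map_triple(triple):
    """Map step: emit (join key, tagged value) so matching triples meet at one reducer."""
    s, p, o = triple
    if p == RDF_TYPE:
        yield o, ("instance", s)      # keyed by the class the instance belongs to
    elif p == SUBCLASS_OF:
        yield s, ("superclass", o)    # keyed by the subclass


def reduce_class(key, values):
    """Reduce step: join instances of class `key` with the superclasses of `key`."""
    instances = [v for tag, v in values if tag == "instance"]
    superclasses = [v for tag, v in values if tag == "superclass"]
    for inst in instances:
        for sup in superclasses:
            yield (inst, RDF_TYPE, sup)


def run_job(triples):
    """Hypothetical in-memory stand-in for one MapReduce job (map, shuffle, reduce)."""
    groups = defaultdict(list)
    for t in triples:
        for key, value in map_triple(t):
            groups[key].append(value)
    derived = set()
    for key, values in groups.items():
        derived.update(reduce_class(key, values))
    return derived


def infer_types(triples):
    """Re-run the job on its own output until no new triples are derived."""
    closure = set(triples)
    while True:
        new = run_job(closure) - closure
        if not new:
            return closure
        closure |= new


sample = {
    ("ex:rex", RDF_TYPE, "ex:Dog"),
    ("ex:Dog", SUBCLASS_OF, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
}
closure = infer_types(sample)
```

In this sketch a single job derives only one level of the class hierarchy, so the output is fed back in as input until a fixpoint is reached. That loop is one way to picture the abstract's remark that the output of one reasoning job can be the input to another.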
Pages: 305-313
Page count: 9