D-REPR: A Language for Describing and Mapping Diversely-Structured Data Sources to RDF

Cited by: 11
Authors
Vu, Binh [1]
Pujara, Jay [1]
Knoblock, Craig A. [1]
Affiliations
[1] USC Information Sciences Institute, Marina del Rey, CA 90292, USA
Source
PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON KNOWLEDGE CAPTURE (K-CAP '19) | 2019
Keywords
RDF mapping; Linked Data; Knowledge Graph
DOI
10.1145/3360901.3364449
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Publishing data sources to knowledge graphs is a complicated and laborious process because data sources are often heterogeneous, hierarchical, and interlinked. For example, food price datasets may contain product prices in various units at different markets and times, and different providers may choose among many formats, such as CSV, JSON, or spreadsheets. Beyond data formats, these datasets may have differing layouts: one dataset may be organized as a row-based or relational table (prices are in one column), while another may use a matrix table (prices are in one matrix). To address these problems, we present a novel data description language for mapping datasets to RDF. In particular, our language supports specifying the locations of source attributes in the sources, mapping the attributes to ontologies, and simple rules for joining the data of these attributes to output the final RDF triples. Unlike existing approaches, our language is not restricted to specific data layouts, such as the Nested Relational Model, or to specific data formats, such as spreadsheets. Our broad data description language presents a format-independent solution that allows interlinking among multiple heterogeneous sources and represents many diverse data structures that existing tools are unable to handle.
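To make the layout problem in the abstract concrete, the following is a minimal Python sketch showing the same price observations stored once as a row-based table and once as a matrix table, and how both can yield identical triples. It is not D-REPR syntax and not the authors' implementation; the sample data, the `ex:` prefix, and the function names are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterator, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical sample data: the same four observations in two layouts.
# Row-based (relational) layout: prices sit in a single column.
ROW_BASED: List[Dict[str, str]] = [
    {"market": "A", "date": "2019-01", "price": "1.20"},
    {"market": "A", "date": "2019-02", "price": "1.25"},
    {"market": "B", "date": "2019-01", "price": "1.10"},
    {"market": "B", "date": "2019-02", "price": "1.15"},
]

# Matrix layout: dates run along the header, prices fill the body.
MATRIX = {
    "header": ["market", "2019-01", "2019-02"],
    "rows": [
        ["A", "1.20", "1.25"],
        ["B", "1.10", "1.15"],
    ],
}


def observation(market: str, date: str, price: str) -> Iterator[Triple]:
    """Emit the triples for one price observation (illustrative vocabulary)."""
    subject = f"ex:obs/{market}/{date}"
    yield (subject, "ex:market", market)
    yield (subject, "ex:date", date)
    yield (subject, "ex:price", price)


def triples_from_rows(rows: List[Dict[str, str]]) -> Iterator[Triple]:
    """Row-based layout: every attribute is a column of the same row."""
    for row in rows:
        yield from observation(row["market"], row["date"], row["price"])


def triples_from_matrix(matrix: dict) -> Iterator[Triple]:
    """Matrix layout: join each cell with its row label and column header."""
    dates = matrix["header"][1:]
    for row in matrix["rows"]:
        market, prices = row[0], row[1:]
        for date, price in zip(dates, prices):
            yield from observation(market, date, price)


if __name__ == "__main__":
    same = set(triples_from_rows(ROW_BASED)) == set(triples_from_matrix(MATRIX))
    print("Layouts produce identical triples:", same)
```

In this sketch each layout needs its own hand-written traversal. The abstract describes replacing that per-layout code with a declarative description of attribute locations, ontology mappings, and join rules.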
Pages: 189-196
Page count: 8