Efficient Ontology-Based Data Integration with Canonical IRIs

Cited by: 9
Authors
Xiao, Guohui [1]
Hovland, Dag [2]
Bilidas, Dimitris [3]
Rezk, Martin [4]
Giese, Martin [2]
Calvanese, Diego [1]
Affiliations
[1] Free Univ Bozen Bolzano, Fac Comp Sci, Bolzano, Italy
[2] Univ Oslo, Dept Informat, Oslo, Norway
[3] Natl & Kapodistrian Univ Athens, Athens, Greece
[4] Rakuten, Tokyo, Japan
Source
SEMANTIC WEB (ESWC 2018) | 2018 / Vol. 10843
Keywords
DOI
10.1007/978-3-319-93417-4_45
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we study how to efficiently integrate multiple relational databases using an ontology-based approach. In ontology-based data integration (OBDI) an ontology provides a coherent view of multiple databases, and SPARQL queries over the ontology are rewritten into (federated) SQL queries over the underlying databases. Specifically, we address the scenario where records with different identifiers in different databases can represent the same entity. The standard approach in this case is to use sameAs to model the equivalence between entities. However, the standard semantics of sameAs may cause an exponential blow-up of query results, since all possible combinations of equivalent identifiers have to be included in the answers. The large number of answers is not only detrimental to the performance of query evaluation, but also makes the answers difficult to understand due to the redundancy they introduce. This motivates us to propose an alternative approach, which is based on assigning canonical IRIs to entities in order to avoid redundancy. Formally, we present our approach as a new SPARQL entailment regime and compare it with the sameAs approach. We provide a prototype implementation and evaluate it in two experiments: in a real-world data integration scenario at Statoil and in an experiment extending the Wisconsin benchmark. The experimental results show that the canonical IRI approach is significantly more scalable.
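The blow-up described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the IRIs, the `SAME_AS` links, and the helper names are hypothetical, and picking the lexicographically smallest IRI as the canonical one is an assumption made only for illustration (the abstract does not say how canonical IRIs are chosen). The sketch shows that under sameAs semantics one answer tuple expands into every combination of equivalent identifiers, while the canonical approach keeps a single tuple.

```python
from itertools import product

# Hypothetical sameAs links between record identifiers of different databases.
SAME_AS = [
    ("db1:well/1", "db2:wellbore/A"),
    ("db2:wellbore/A", "db3:w/x9"),
    ("db1:field/7", "db2:field/F"),
]


def equivalence_classes(links):
    """Map every IRI to the set of IRIs it is connected to by sameAs links."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for iri in list(parent):
        groups.setdefault(find(iri), set()).add(iri)
    return {iri: members for members in groups.values() for iri in members}


def same_as_answers(answer, classes):
    """Expand one answer tuple into all combinations of equivalent IRIs."""
    choices = (sorted(classes.get(iri, {iri})) for iri in answer)
    return sorted(product(*choices))


def canonical_answer(answer, classes):
    """Replace each IRI by one representative (assumption: the smallest IRI)."""
    return tuple(min(classes.get(iri, {iri})) for iri in answer)


classes = equivalence_classes(SAME_AS)
answer = ("db1:well/1", "db1:field/7")

expanded = same_as_answers(answer, classes)
canonical = canonical_answer(answer, classes)

print(len(expanded))  # 3 equivalent wells * 2 equivalent fields = 6 tuples
print(canonical)      # a single tuple of representatives
```

With k answer variables whose entities each have n equivalent identifiers, the expansion yields n^k tuples, whereas the canonical form stays at one tuple per answer.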
Pages: 697-713
Number of pages: 17
Related papers
27 in total
  • [1] Ontology-Based Data Access for Maritime Security
    Brueggemann, Stefan
    Bereta, Konstantina
    Xiao, Guohui
    Koubarakis, Manolis
    [J]. SEMANTIC WEB: LATEST ADVANCES AND NEW DOMAINS, 2016, 9678 : 741 - 757
  • [2] Tractable reasoning and efficient query answering in description logics: The DL-Lite family
    Calvanese, Diego
    De Giacomo, Giuseppe
    Lembo, Domenico
    Lenzerini, Maurizio
    Rosati, Riccardo
    [J]. JOURNAL OF AUTOMATED REASONING, 2007, 39 (03) : 385 - 429
  • [3] Calvanese D, 2017, SEMANT WEB, V8, P471, DOI 10.3233/SW-160217
  • [4] Ontology-Based Integration of Cross-Linked Datasets
    Calvanese, Diego
    Giese, Martin
    Hovland, Dag
    Rezk, Martin
    [J]. SEMANTIC WEB - ISWC 2015, PT I, 2015, 9366 : 199 - 216
  • [5] Calvanese D, 2008, LECT NOTES COMPUT SC, V4925, P26, DOI 10.1007/978-3-540-88594-8_2
  • [6] Chawathe S.S., 1994, PROCEEDINGS ACM T CO, P7
  • [7] Chronis Y., 2016, P WORKSH EDBT ICDT 2
  • [8] Das S., 2012, R2RML: RDB to RDF Mapping Language, DOI 10.1017/CBO9781107415324.004
  • [9] DeWitt D.J., 1992, BENCHMARK HDB
  • [10] The BigDAWG Polystore System
    Duggan, Jennie
    Elmore, Aaron J.
    Stonebraker, Michael
    Balazinska, Magda
    Howe, Bill
    Kepner, Jeremy
    Madden, Sam
    Maier, David
    Mattson, Tim
    Zdonik, Stan
    [J]. SIGMOD RECORD, 2015, 44 (02) : 11 - 16