EABlock: A Declarative Entity Alignment Block for Knowledge Graph Creation Pipelines

Cited by: 4
Authors
Jozashoori, Samaneh [1]
Sakor, Ahmad [1]
Iglesias, Enrique [2]
Vidal, Maria-Esther [1]
Affiliations
[1] Leibniz Univ Hannover, TIB Leibniz Informat Ctr Sci & Technol, Hannover, Germany
[2] Leibniz Univ Hannover, L3S Res Ctr, Hannover, Germany
Source
37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING | 2022
Keywords
Knowledge Graph Creation; Semantic Data Integration; Entity Alignment; Mapping Rules; Functional Mappings;
DOI
10.1145/3477314.3507132
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Despite encoding an enormous amount of rich and valuable data, existing data sources are mostly created independently, which poses a significant challenge to their integration. Mapping languages, e.g., RML and R2RML, facilitate the declarative specification of the process of applying meta-data and integrating data into a knowledge graph. In addition to expressing correspondences between data sources and a unified schema, mapping rules can also include knowledge extraction functions. Combining mapping rules and functions yields a powerful formalism for transparently specifying pipelines that integrate data into a knowledge graph. Surprisingly, these formalisms are not fully adopted, and many knowledge graphs are created by executing ad-hoc programs to pre-process and integrate data. In this paper, we present EABlock, an approach that integrates Entity Alignment (EA) as part of RML mapping rules. EABlock includes a block of functions that perform entity recognition on textual attributes and link the recognized entities to the corresponding resources in Wikidata, DBpedia, and domain-specific thesauri, e.g., UMLS. EABlock provides agnostic and efficient techniques to evaluate the functions and transfer the mappings, which facilitates its application in any RML-compliant engine. We have empirically evaluated EABlock's performance, and the results indicate that EABlock speeds up knowledge graph creation pipelines that require entity recognition and linking in state-of-the-art RML-compliant engines. EABlock is also publicly available as a tool through a GitHub repository and a DOI.
Pages: 1908 - 1916
Number of pages: 9