EABlock: A Declarative Entity Alignment Block for Knowledge Graph Creation Pipelines

Cited by: 4
Authors
Jozashoori, Samaneh [1]
Sakor, Ahmad [1]
Iglesias, Enrique [2]
Vidal, Maria-Esther [1]
Affiliations
[1] Leibniz Univ Hannover, TIB Leibniz Informat Ctr Sci & Technol, Hannover, Germany
[2] Leibniz Univ Hannover, L3S Res Ctr, Hannover, Germany
Source
37TH ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING | 2022
Keywords
Knowledge Graph Creation; Semantic Data Integration; Entity Alignment; Mapping Rules; Functional Mappings;
DOI
10.1145/3477314.3507132
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
Despite encoding an enormous amount of rich and valuable data, existing data sources are mostly created independently, which poses a significant challenge to their integration. Mapping languages, e.g., RML and R2RML, facilitate the declarative specification of the process of applying metadata and integrating data into a knowledge graph. Mapping rules can also include knowledge extraction functions in addition to expressing correspondences among data sources and a unified schema. Combining mapping rules and functions represents a powerful formalism to specify pipelines for integrating data into a knowledge graph transparently. Surprisingly, these formalisms are not fully adopted, and many knowledge graphs are created by executing ad-hoc programs to pre-process and integrate data. In this paper, we present EABlock, an approach integrating Entity Alignment (EA) as part of RML mapping rules. EABlock includes a block of functions that perform entity recognition on textual attributes and link the recognized entities to the corresponding resources in Wikidata, DBpedia, and domain-specific thesauri, e.g., UMLS. EABlock provides agnostic and efficient techniques to evaluate the functions and transfer the mappings to facilitate its application in any RML-compliant engine. We have empirically evaluated the performance of EABlock, and the results indicate that EABlock speeds up knowledge graph creation pipelines that require entity recognition and linking in state-of-the-art RML-compliant engines. EABlock is also publicly available as a tool through a GitHub repository and a DOI.
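To make the idea of calling an entity-alignment function from within a mapping step more concrete, the following is a minimal sketch in Python. It is not EABlock's implementation and does not use RML syntax: the lookup table `TOY_KB`, the functions `link_entity` and `map_rows`, and all `example.org` IRIs are hypothetical stand-ins for a real entity-linking service and a real mapping engine.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical lookup table standing in for a real entity-linking service
# (assumption: labels are matched by case-insensitive substring search).
TOY_KB: Dict[str, str] = {
    "aspirin": "http://example.org/kb/Aspirin",
    "lung cancer": "http://example.org/kb/LungCancer",
}


def link_entity(text: str, kb: Dict[str, str] = TOY_KB) -> Optional[str]:
    """Return the IRI of the first KB label found in `text`, or None."""
    lowered = text.lower()
    for label, iri in kb.items():
        if label in lowered:
            return iri
    return None


def map_rows(
    rows: List[Dict[str, str]],
    subject_template: str,
    predicate: str,
    text_field: str,
) -> List[Triple]:
    """Apply a mapping-rule-like step: build one triple per row whose
    textual attribute can be linked to a KB resource."""
    triples: List[Triple] = []
    for row in rows:
        iri = link_entity(row[text_field])
        if iri is not None:
            subject = subject_template.format(**row)
            triples.append((subject, predicate, iri))
    return triples


rows = [
    {"id": "1", "note": "Patient treated with Aspirin daily"},
    {"id": "2", "note": "No medication recorded"},
]
triples = map_rows(
    rows,
    subject_template="http://example.org/patient/{id}",
    predicate="http://example.org/vocab/mentions",
    text_field="note",
)
for t in triples:
    print(t)
```

In this sketch the linking function is evaluated while the rows are being mapped, so no separate pre-processing program is needed; only rows with a recognized entity yield a triple.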
Pages: 1908-1916
Page count: 9