Detecting Identical Entities in the Semantic Web Data

Authors
Holub, Michal [1 ]
Proksa, Ondrej [1 ]
Bielikova, Maria [1 ]
Affiliations
[1] Slovak Univ Technol Bratislava, Inst Informat & Software Engn, Fac Informat & Informat Technol, Bratislava 84216, Slovakia
Source
SOFSEM 2015: THEORY AND PRACTICE OF COMPUTER SCIENCE | 2015 / Vol. 8939
Keywords
duplicates; identity; similarity; relationship; semantic web; owl:sameAs; Linked Data; web of data;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The large number of entities published by various sources inevitably introduces inaccuracies, mainly duplicated information. Duplicates can be found even within a single dataset. In this paper we propose a method for the automatic discovery of identity relationships between two entities (also known as instance matching) in a dataset represented as a graph (e.g. in the Linked Data Cloud). Our method can be used for removing duplicates from existing datasets, validating existing identity relationships between entities within a dataset, or connecting different datasets using the owl:sameAs relationship. It is based on the analysis of sub-graphs formed by entities, their properties and the existing relationships between them. It can learn a common similarity threshold for a particular dataset, so it adapts to the properties of different datasets. We evaluated our method in several experiments on data from the domains of public administration and digital libraries.
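The abstract does not spell out the matching procedure, so the following is only a minimal sketch of the general idea it describes, under stated assumptions: each entity is compared through its one-hop neighborhood of (property, value) pairs, similarity is plain Jaccard overlap (a stand-in technique, not necessarily the measure used in the paper), and a dataset-wide threshold is chosen by maximizing F1 on a few labeled pairs. The toy triples, the `ex:` identifiers and all function names are hypothetical.

```python
# Toy graph as (subject, predicate, object) triples; all identifiers are hypothetical.
TRIPLES = [
    ("ex:a1", "name", "M. Holub"),
    ("ex:a1", "affiliation", "STU Bratislava"),
    ("ex:a1", "wrote", "ex:p1"),
    ("ex:a2", "name", "M. Holub"),
    ("ex:a2", "affiliation", "STU Bratislava"),
    ("ex:a2", "wrote", "ex:p2"),
    ("ex:a3", "name", "J. Novak"),
    ("ex:a3", "affiliation", "Another Univ"),
    ("ex:a3", "wrote", "ex:p3"),
]


def neighborhood(entity, triples):
    """Return the set of (predicate, object) pairs attached to an entity."""
    return {(p, o) for s, p, o in triples if s == entity}


def jaccard(a, b):
    """Jaccard overlap of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(e1, e2, triples):
    """Similarity of two entities based on their one-hop sub-graphs."""
    return jaccard(neighborhood(e1, triples), neighborhood(e2, triples))


def learn_threshold(labeled_pairs, triples):
    """Pick the observed similarity score that maximizes F1 on labeled pairs.

    labeled_pairs holds (entity1, entity2, is_same) tuples. Assumption:
    a small hand-labeled sample is available for the dataset.
    """
    scored = [(similarity(a, b, triples), same) for a, b, same in labeled_pairs]
    best_t, best_f1 = 1.0, -1.0
    for t in sorted({s for s, _ in scored}):
        tp = sum(1 for s, same in scored if s >= t and same)
        fp = sum(1 for s, same in scored if s >= t and not same)
        fn = sum(1 for s, same in scored if s < t and same)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t


LABELED = [
    ("ex:a1", "ex:a2", True),
    ("ex:a1", "ex:a3", False),
    ("ex:a2", "ex:a3", False),
]

threshold = learn_threshold(LABELED, TRIPLES)
# Pairs scoring at or above the threshold are candidates for an owl:sameAs link.
candidates = [(a, b) for a, b, _ in LABELED if similarity(a, b, TRIPLES) >= threshold]
```

In this toy data only the pair `ex:a1` / `ex:a2` reaches the learned threshold. A realistic setup would likely need fuzzy value comparison and deeper sub-graphs, which this sketch leaves out.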
Pages: 519-530
Page count: 12
Related papers
  • [1] Web of data and web of entities: Identity and reference in interlinked data in the semantic web
    Bouquet P.
    Stoermer H.
    Vignolo M.
    Philosophy & Technology, 2012, 25 (1) : 5 - 26
  • [2] Disambiguating named entities by semantic web
    Azari, Ideh
    Koohpeyma, Fateme
    PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND SERVICE SYSTEM (CSSS), 2014, 109 : 741 - 744
  • [3] Data journalism and the semantic web
    Anton Bravo, Adolfo
    CIC-CUADERNOS DE INFORMACION Y COMUNICACION, 2013, 18 : 99 - 116
  • [4] Data Linking for the Semantic Web
    Ferrara, Alfio
    Nikolov, Andriy
    Scharffe, Francois
    INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS, 2011, 7 (03) : 46 - 76
  • [5] Data Conflict Resolution among Same Entities in Web of Data
    Askarizade, Mojgan
    Nematbakhsh, Mohammad Ali
    Jam, Enseih Davoodi
    BRAIN-BROAD RESEARCH IN ARTIFICIAL INTELLIGENCE AND NEUROSCIENCE, 2012, 3 (03) : 18 - 24
  • [6] Linking Semantic Desktop Data to the Web of Data
    Dragan, Laura
    Delbru, Renaud
    Groza, Tudor
    Handschuh, Siegfried
    Decker, Stefan
    SEMANTIC WEB - ISWC 2011, PT II, 2011, 7032 : 33 - +
  • [7] Enrichment of the Dataset of Joint Educational Entities with the Web of Data
    Limongelli, Carla
    Lombardi, Matteo
    Marani, Alessandro
    Taibi, Davide
    2017 IEEE 17TH INTERNATIONAL CONFERENCE ON ADVANCED LEARNING TECHNOLOGIES (ICALT), 2017, : 528 - 529
  • [8] A Taste of Linked Data and the Semantic Web
    Hyland-Wood, David
    Zaidman, Marsha
    SIGCSE 12: PROCEEDINGS OF THE 43RD ACM TECHNICAL SYMPOSIUM ON COMPUTER SCIENCE EDUCATION, 2011, : 658 - 658
  • [9] SSONDE: Semantic Similarity on LiNked Data Entities
    Albertoni, Riccardo
    De Martino, Monica
    METADATA AND SEMANTICS RESEARCH, 2012, 343 : 25 - +
  • [10] Automatic Integration of Spatial Data into the Semantic Web
    Prudhomme, Claire
    Homburg, Timo
    Ponciano, Jean-Jacques
    Boochs, Frank
    Roxin, Ana
    Cruz, Christophe
    WEBIST: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND TECHNOLOGIES, 2017, : 107 - 115