Explainable Similarity of Datasets Using Knowledge Graph

Cited by: 3
Authors
Skoda, Petr [1]
Klimek, Jakub [1]
Necasky, Martin [1]
Skopal, Tomas [1]
Affiliations
[1] Charles Univ Prague, Dept Software Engn, Fac Math & Phys, Malostranske Namesti 25, Prague 11800 1, Czech Republic
Source
SIMILARITY SEARCH AND APPLICATIONS (SISAP 2019) | 2019 / Vol. 11807
Keywords
Similarity; Datasets; Search; Graph;
DOI
10.1007/978-3-030-32047-8_10
Chinese Library Classification number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A large number of datasets are available as Open Data on the Web. However, users find it challenging to locate datasets relevant to their needs, even though the datasets are registered in catalogs such as the European Data Portal. This is because the available metadata, such as keywords or textual descriptions, is not descriptive enough. At the same time, datasets exist in various types of contexts that are not expressed in the metadata. These may include information about the dataset publisher, the legislation related to dataset publication, language and cultural specifics, etc. In this paper we introduce a similarity model for matching datasets. The model assumes an ontology or knowledge graph, such as Wikidata.org, that serves as a graph-based context to which individual datasets are mapped based on their metadata. The similarity of two datasets is then computed as an aggregation over paths among nodes in the graph. The proposed similarity aims to address the problem of explainability of similarity, i.e., providing the user with a structured explanation of the match. Explainability, in a broader sense, is currently an active topic in the field of artificial intelligence.
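The idea of computing similarity as an aggregation over graph paths can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' method: the toy graph, the use of breadth-first shortest paths, the 1/(number of nodes on the path) score, and the averaging step are all hypothetical choices, since the abstract does not specify the path selection or the aggregation function. Each dataset is represented by the graph nodes its metadata maps to, and the returned paths play the role of the explanation shown to the user.

```python
from collections import deque

# Hypothetical toy knowledge graph as undirected adjacency lists.
# A real setting would use a graph such as Wikidata instead.
GRAPH = {
    "air quality": ["pollution", "environment"],
    "pollution": ["air quality", "emissions"],
    "emissions": ["pollution", "traffic"],
    "traffic": ["emissions", "transport"],
    "transport": ["traffic"],
    "environment": ["air quality"],
}


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the nodes of a shortest path, or None."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start node.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def explain_similarity(graph, nodes_a, nodes_b):
    """Return (score, paths) for two datasets given as lists of graph nodes.

    Assumed scoring: for every node of dataset A take the shortest path to
    any node of dataset B, score it as 1 / (nodes on the path), and average.
    """
    scores, explanation = [], []
    for a in nodes_a:
        best = None
        for b in nodes_b:
            path = shortest_path(graph, a, b)
            if path is not None and (best is None or len(path) < len(best)):
                best = path
        if best is None:
            scores.append(0.0)  # no connection found for this node
        else:
            scores.append(1.0 / len(best))
            explanation.append(best)
    score = sum(scores) / len(scores) if scores else 0.0
    return score, explanation


score, paths = explain_similarity(GRAPH, ["air quality"], ["traffic"])
print(score)
for path in paths:
    print(" -> ".join(path))
```

Under these assumptions, a dataset tagged "air quality" and one tagged "traffic" are connected through "pollution" and "emissions", and that chain of nodes is what a user would see as the reason for the match. Shorter paths yield higher scores, and two datasets mapped to the same node score 1.0.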
Pages: 103-110
Page count: 8