Defining Key Semantics for the RDF Datasets: Experiments and Evaluations

Cited by: 9
Authors
Atencia, Manuel [1, 4]
Chein, Michel [2, 4]
Croitoru, Madalina [2, 4]
David, Jerome [1, 4]
Leclere, Michel [2, 4]
Pernelle, Nathalie [3 ]
Sais, Fatiha [3 ]
Scharffe, Francois [2 ]
Symeonidou, Danai [3 ]
Affiliations
[1] Univ Grenoble Alpes, LIG, Grenoble, France
[2] Univ Montpellier 2, LIRMM, F-34095 Montpellier 5, France
[3] Univ Paris Sud, LRI, Paris, France
[4] Inria, Le Chesnay, France
Source
GRAPH-BASED REPRESENTATION AND REASONING | 2014 / Vol. 8577
DOI
10.1007/978-3-319-08389-6_7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many techniques have recently been proposed to automate the linkage of RDF datasets. Predicate selection is the step of the linkage process that consists of selecting the smallest set of relevant predicates needed to enable instance comparison. We call such a set of predicates a key, by analogy with the notion of keys in relational databases. We formally explain the different assumptions behind two existing key semantics. We then evaluate the keys experimentally by studying how discovered keys can help dataset interlinking or cleaning. We discuss the experimental results and show that the two semantics lead to comparable results on the studied datasets.
Pages: 65-78
Number of pages: 14