RDF2Vec: RDF Graph Embeddings for Data Mining

Cited by: 226
Authors
Ristoski, Petar [1]
Paulheim, Heiko [1]
Affiliations
[1] Univ Mannheim, Data & Web Sci Grp, Mannheim, Germany
Source
SEMANTIC WEB - ISWC 2016, PT I | 2016 / Vol. 9981
Keywords
Graph embeddings; Linked open data; Data mining; KERNEL;
DOI
10.1007/978-3-319-46523-4_30
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Linked Open Data has been recognized as a valuable source for background information in data mining. However, most data mining tools require features in propositional form, i.e., a vector of nominal or numerical features associated with an instance, while Linked Open Data sources are graphs by nature. In this paper, we present RDF2Vec, an approach that uses language modeling approaches for unsupervised feature extraction from sequences of words, and adapts them to RDF graphs. We generate sequences by leveraging local information from graph substructures, harvested by Weisfeiler-Lehman Subtree RDF Graph Kernels and graph walks, and learn latent numerical representations of entities in RDF graphs. Our evaluation shows that such vector representations outperform existing techniques for the propositionalization of RDF graphs on a variety of different predictive machine learning tasks, and that feature vector representations of general knowledge graphs such as DBpedia and Wikidata can be easily reused for different tasks.
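The graph-walk step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a toy set of triples and hypothetical helper names (`build_adjacency`, `random_walks`) that do not come from the paper, and it covers only random walks; the Weisfeiler-Lehman subtree variant and the language-model training step are not shown.

```python
import random
from collections import defaultdict

# Toy triples (subject, predicate, object). All identifiers are made up
# for illustration and are not taken from DBpedia or Wikidata.
TRIPLES = [
    ("ex:Mannheim", "ex:locatedIn", "ex:Germany"),
    ("ex:Berlin", "ex:locatedIn", "ex:Germany"),
    ("ex:Berlin", "ex:capitalOf", "ex:Germany"),
    ("ex:Germany", "ex:partOf", "ex:Europe"),
]


def build_adjacency(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    adjacency = defaultdict(list)
    for subj, pred, obj in triples:
        adjacency[subj].append((pred, obj))
    return adjacency


def random_walks(adjacency, start, depth, walks_per_entity, rng):
    """Return token sequences shaped like [entity, predicate, entity, ...]."""
    walks = []
    for _ in range(walks_per_entity):
        walk = [start]
        node = start
        for _ in range(depth):
            edges = adjacency.get(node)
            if not edges:
                break  # dead end: no outgoing edges from this node
            pred, obj = rng.choice(edges)
            walk.extend([pred, obj])
            node = obj
        walks.append(walk)
    return walks


rng = random.Random(0)  # fixed seed so the sketch is repeatable
adjacency = build_adjacency(TRIPLES)
sequences = random_walks(adjacency, "ex:Berlin", depth=2,
                         walks_per_entity=3, rng=rng)
```

Each sequence plays the role of a sentence whose tokens are entities and predicates. A word-embedding model trained on many such sequences would then yield one vector per entity, which is the propositional form that standard data mining tools expect.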
Pages: 498-514
Page count: 17