Feature selection based on the complexity of structural patterns in RDF graphs

Cited by: 0
Authors
Kaneiwa, Ken [1]
Minami, Yota [1]
Affiliations
[1] Univ Electrocommun, Grad Sch Informat & Engn, Dept Comp & Network Engn, Tokyo, Japan
Keywords
RDF; Feature selection; Graph kernel; Machine learning
DOI
10.1007/s41060-023-00466-w
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Resource Description Framework (RDF) is a framework for describing metadata, such as the attributes and relationships of resources on the Web. Machine learning tasks on RDF graphs adopt one of three methods: (i) support vector machines (SVMs) with RDF graph kernels, (ii) RDF graph embeddings, and (iii) relational graph convolutional networks. In this paper, we propose a novel feature vector, called a Skip vector, that represents features of each resource in an RDF graph by extracting various combinations of neighboring edges and nodes. To make the Skip vector low-dimensional, we select the features that are important for classification tasks based on the information gain ratio of each feature. Classification can then be performed by applying the low-dimensional Skip vector of each resource to conventional machine learning algorithms, such as SVMs, the k-nearest neighbors method, neural networks, random forests, and AdaBoost. In our evaluation experiments with RDF data, such as Wikidata, DBpedia, and YAGO, we compare our method with RDF graph kernels in an SVM. We also compare our method with two other approaches, RDF graph embeddings such as RDF2Vec and relational graph convolutional networks, on the AIFB, MUTAG, BGS, and AM benchmarks. The results show that the proposed Skip vectors represent the features of target resources in an RDF graph better than traditional methods and make conventional machine learning algorithms applicable to classification tasks on RDF data.
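The feature selection step mentioned in the abstract, ranking features by information gain ratio and keeping the best ones, can be illustrated with a minimal sketch. This is not the authors' implementation: the construction of Skip vectors is not shown, the toy binary vectors and labels are made up for illustration, and the helper names `entropy`, `gain_ratio`, and `select_top_k` are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Information gain of a discrete feature divided by its split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    # Expected label entropy after splitting on the feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A constant feature has zero split information and carries no signal.
    return info_gain / split_info if split_info > 0 else 0.0


def select_top_k(vectors, labels, k):
    """Keep the k feature columns with the highest gain ratio (assumed policy)."""
    n_features = len(vectors[0])
    scores = [gain_ratio([v[j] for v in vectors], labels) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[v[j] for j in keep] for v in vectors]


# Toy data (assumption): rows are resources, columns are binary features.
toy_vectors = [[1, 0, 1], [1, 1, 0], [0, 0, 1], [0, 1, 0]]
toy_labels = ["a", "a", "b", "b"]
kept_columns, reduced_vectors = select_top_k(toy_vectors, toy_labels, k=1)
```

The reduced vectors could then be passed to any conventional classifier. Whether the paper selects a fixed number of features or applies a threshold on the gain ratio is not stated in the abstract, so the top-k policy here is only an assumption.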
Pages: 217-227
Page count: 11