Topic profiling benchmarks in the linked open data cloud: Issues and lessons learned

Cited by: 6
Authors
Spahiu, Blerina [1]
Maurino, Andrea [1]
Meusel, Robert [2]
Affiliations
[1] Univ Milano Bicocca, Dept Informat Syst & Commun, Milan, Italy
[2] Univ Mannheim, Data & Web Sci Grp, Mannheim, Germany
Keywords
Benchmarking; topic classification; linked open data; LOD; topical profiling
DOI
10.3233/SW-180323
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Topical profiling of the datasets contained in the Linking Open Data (LOD) cloud has been of interest since this kind of data became available on the Web. Various automatic classification approaches have been proposed to replace the manual task of assigning topics to each individual (new) dataset. Although the quality of these automated approaches is comparatively sufficient, it has been shown that in most cases a single topical label per dataset does not capture the topics described by the dataset's content. Therefore, in this study, we introduce a machine-learning-based approach that assigns a single topic, as well as multiple topics, to a LOD dataset, and we evaluate the results. As part of this work, we present the first multi-topic classification benchmark for LOD cloud datasets, which is freely accessible. In addition, the article discusses the challenges and obstacles that need to be addressed when building such a benchmark.
Pages: 329-348
Number of pages: 20