LSQ 2.0: A linked dataset of SPARQL query logs

Cited by: 1
Authors
Stadler, Claus [1]
Saleem, Muhammad [1]
Mehmood, Qaiser [2]
Buil-Aranda, Carlos [3, 4]
Dumontier, Michel [5]
Hogan, Aidan [3, 6]
Ngomo, Axel-Cyrille Ngonga [1]
Affiliations
[1] Univ Leipzig, IFI AKSW, PO 100920, D-04009 Leipzig, Germany
[2] Natl Univ Ireland, Insight Ctr Data Analyt, Galway, Ireland
[3] IMFD, Santiago, Chile
[4] Univ Tecn Federico Santa Maria, Informat Dept, Valparaiso, Chile
[5] Maastricht Univ, Inst Data Sci, Maastricht, Netherlands
[6] Univ Chile, Dept Comp Sci, Santiago, Chile
Keywords
SPARQL; Query Log Analysis; Web of Data; RDF; WEB;
DOI
10.3233/SW-223015
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We present the Linked SPARQL Queries (LSQ) dataset, which currently describes 43.95 million executions of 11.56 million unique SPARQL queries extracted from the logs of 27 different endpoints. The LSQ dataset provides RDF descriptions of each such query, which are indexed in a public LSQ endpoint, allowing interested parties to find queries with the characteristics they require. We begin by describing the use cases envisaged for the LSQ dataset, which include applications for research on common features of queries, for building custom benchmarks, and for designing user interfaces. We then discuss how LSQ has been used in practice since the release of four initial SPARQL logs in 2015. We discuss the model and vocabulary that we use to represent these queries in RDF. We then provide a brief overview of the 27 endpoints from which we extracted queries in terms of the domain to which they pertain and the data they contain. We provide statistics on the queries included from each log, including the number of query executions, unique queries, as well as distributions of queries for a variety of selected characteristics. We finally discuss how the LSQ dataset is hosted and how it can be accessed and leveraged by interested parties for their use cases.
Pages: 167-189
Number of pages: 23