An Analytical Study of Large SPARQL Query Logs

Cited by: 88
Authors
Bonifati, Angela [1]
Martens, Wim [2]
Timm, Thomas [2]
Affiliations
[1] Lyon 1 Univ, Lyon, France
[2] Univ Bayreuth, Bayreuth, Germany
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2017 / Vol. 11 / No. 02
Keywords
Information retrieval - Trees (mathematics) - Undirected graphs;
DOI
10.14778/3149193.3149196
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
With the adoption of RDF as the data model for Linked Data and the Semantic Web, query specification from end users has become more and more common in SPARQL endpoints. In this paper, we conduct an in-depth analytical study of the queries formulated by end-users and harvested from large and up-to-date query logs from a wide variety of RDF data sources. As opposed to previous studies, ours is the first assessment on a voluminous query corpus, spanning over several years and covering many representative SPARQL endpoints. Apart from the syntactical structure of the queries, that exhibits already interesting results on this generalized corpus, we drill deeper in the structural characteristics related to the graph and hypergraph representation of queries. We outline the most common shapes of queries when visually displayed as undirected graphs, and characterize their (hyper-)tree width. Moreover, we analyze the evolution of queries over time, by introducing the novel concept of a streak, i.e., a sequence of queries that appear as subsequent modifications of a seed query. Our study offers several fresh insights on the already rich query features of real SPARQL queries formulated by real users, and brings us to draw a number of conclusions and pinpoint future directions for SPARQL query evaluation, query optimization, tuning, and benchmarking.
Pages: 149-161
Number of pages: 13