GoFast: Graph-based optimization for efficient and scalable query evaluation

被引:4
|
作者
Zouaghi, Ishaq [1 ,3 ]
Mesmoudi, Amin [2 ]
Galicia, Jorge [1 ]
Bellatreche, Ladjel [1 ]
Aguili, Taoufik [3 ]
机构
[1] LIAS ISAE ENSMA, Chasseneuil, France
[2] Univ Poitiers, LIAS, Poitiers, France
[3] LR SysCom ENIT UTM, Tunis, Tunisia
关键词
Optimization; RDF; SPARQL; Cardinality estimation; Cost model;
D O I
10.1016/j.is.2021.101738
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
The popularity of the Resource Description Framework (RDF) and SPARQL has thrust the development of high-performance systems to manage data represented with this model. Former approaches adapted the well-established relational model applying its storage, query processing, and optimization strategies. However, the borrowed techniques from the relational model are not universally applicable in the RDF context. First, the schema-free nature of RDF induces intensive joins overheads. Also, optimization strategies trying to find the optimal join order rely on error-prone statistics unable to capture all the correlations among triples. Graph-based approaches keep the graph structure of RDF representing the data directly as a graph. Their execution model leans on graph exploration operators to find subgraph matches to a query. Even if they have shown to outperform relational-based systems in complex queries, they are barely scalable and optimization techniques are completely system dependent. Recently, some systems such as RDF_QDAG have shown that by combining graph exploration and triples clustering one can achieve a good compromise between performance and scalability. In this paper, we propose optimization strategies for this kind of RDF management systems. First, we define novel statistics collected for clusters of triples to better capture the dependencies found in the original graph. Second, we redefine an execution plan based on these logical structures which allows to represent the RDF graph exploration process. Third, we introduce an algorithm for selecting the optimal execution plan based on a customized cost model. Finally, we propose a new approach to refine the chosen plan by pruning invalid clusters that do not participate in the construction of the final query results. All our proposals are validated experimentally using well-known RDF benchmarks. (C) 2021 Elsevier Ltd. All rights reserved.
引用
收藏
页数:18
相关论文
共 50 条
  • [21] Graph-based Indexing Method for Searching in RDF Data
    Kyu, Khin Myat
    Oo, Aung Nway
    2019 INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION TECHNOLOGIES (ICAIT), 2019, : 96 - 101
  • [22] A graph-based cost model for supply chain reconfiguration
    Guo, Weihong
    Tian, Qi
    Jiang, Zhengqian
    Wang, Hui
    JOURNAL OF MANUFACTURING SYSTEMS, 2018, 48 : 55 - 63
  • [23] A graph-based modeling abstraction for optimization: concepts and implementation in Plasmo.jl
    Jordan Jalving
    Sungho Shin
    Victor M. Zavala
    Mathematical Programming Computation, 2022, 14 : 699 - 747
  • [24] A graph-based modeling abstraction for optimization: concepts and implementation in Plasmo.jl
    Jalving, Jordan
    Shin, Sungho
    Zavala, Victor M.
    MATHEMATICAL PROGRAMMING COMPUTATION, 2022, 14 (04) : 699 - 747
  • [25] Efficient Query Processing of Semantic Data using Graph Contraction on RDBMS
    Hayakawa, Akira
    Nishiyama, Hiroyasu
    2013 INTERNATIONAL CONFERENCE ON SIGNAL-IMAGE TECHNOLOGY & INTERNET-BASED SYSTEMS (SITIS), 2013, : 958 - 965
  • [26] Graph-Based Computational Methods for Efficient Management and Energy Conservation in Smart Cities
    Ernst, Sebastian
    Kotulski, Leszek
    Sedziwy, Adam
    Wojnicki, Igor
    ENERGIES, 2023, 16 (07)
  • [27] StarMR: An Efficient Star-Decomposition Based Query Processor for SPARQL Basic Graph Patterns Using MapReduce
    Xu, Qiang
    Wang, Xin
    Li, Jianxin
    Gan, Ying
    Chai, Lele
    Wang, Junhu
    WEB AND BIG DATA (APWEB-WAIM 2018), PT I, 2018, 10987 : 415 - 430
  • [28] Service Graph Base: A Unified Graph-based Platform for Representing and Manipulating Service Artifacts
    Chen, Xi
    Lemos, Angel Lagares
    Barukh, Moshe Chai
    Benatallah, Boualem
    2012 FIFTH IEEE INTERNATIONAL CONFERENCE ON SERVICE-ORIENTED COMPUTING AND APPLICATIONS (SOCA), 2012,
  • [29] CAUSAL GRAPH-BASED VIDEO SEGMENTATION
    Couprie, Camille
    Farabet, Clement
    Lecun, Yann
    Najman, Laurent
    2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 4249 - 4253
  • [30] RG-index: An RDF graph index for efficient SPARQL query processing
    Kim, Kisung
    Moon, Bongki
    Kim, Hyoung-Joo
    EXPERT SYSTEMS WITH APPLICATIONS, 2014, 41 (10) : 4596 - 4607