GoFast: Graph-based optimization for efficient and scalable query evaluation

Cited by: 4
Authors
Zouaghi, Ishaq [1, 3]
Mesmoudi, Amin [2]
Galicia, Jorge [1]
Bellatreche, Ladjel [1]
Aguili, Taoufik [3]
Affiliations
[1] LIAS ISAE ENSMA, Chasseneuil, France
[2] Univ Poitiers, LIAS, Poitiers, France
[3] LR SysCom ENIT UTM, Tunis, Tunisia
Keywords
Optimization; RDF; SPARQL; Cardinality estimation; Cost model;
DOI
10.1016/j.is.2021.101738
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The popularity of the Resource Description Framework (RDF) and SPARQL has driven the development of high-performance systems to manage data represented with this model. Early approaches adapted the well-established relational model, applying its storage, query processing, and optimization strategies. However, techniques borrowed from the relational model are not universally applicable in the RDF context. First, the schema-free nature of RDF induces heavy join overhead. Second, optimization strategies that search for the optimal join order rely on error-prone statistics that cannot capture all the correlations among triples. Graph-based approaches preserve the graph structure of RDF by representing the data directly as a graph. Their execution model relies on graph exploration operators to find subgraph matches for a query. Although they have been shown to outperform relational-based systems on complex queries, they scale poorly and their optimization techniques are entirely system-dependent. Recently, systems such as RDF_QDAG have shown that combining graph exploration with triple clustering achieves a good compromise between performance and scalability. In this paper, we propose optimization strategies for this kind of RDF management system. First, we define novel statistics collected for clusters of triples to better capture the dependencies found in the original graph. Second, we redefine the execution plan in terms of these logical structures so that it can represent the RDF graph exploration process. Third, we introduce an algorithm for selecting the optimal execution plan based on a customized cost model. Finally, we propose a new approach to refine the chosen plan by pruning invalid clusters that do not participate in the construction of the final query results. All our proposals are validated experimentally using well-known RDF benchmarks. (C) 2021 Elsevier Ltd. All rights reserved.
Pages: 18