Parallel query processing in a polystore

Authors
Pavlos Kranas
Boyan Kolev
Oleksandra Levchenko
Esther Pacitti
Patrick Valduriez
Ricardo Jiménez-Peris
Marta Patiño-Martinez
Affiliations
[1] LeanXcale
[2] Distributed Systems Lab at Universidad Politécnica de Madrid
[3] Inria
[4] University of Montpellier
[5] CNRS
[6] LIRMM
Source
Distributed and Parallel Databases, 2021, Volume 39
Keywords
Database integration; Heterogeneous databases; Distributed and parallel databases; Polystores; Query languages; Query processing;
DOI
Not available
Abstract
The proliferation of different data stores has made polystores a major topic in the cloud and big data landscape. As the amount of data grows rapidly, it becomes critical to exploit the inherent parallel processing capabilities of the underlying data stores and data processing platforms. To fully achieve this, a polystore should: (i) preserve the expressivity of each data store’s native query or scripting language and (ii) leverage a distributed architecture to enable parallel data integration, i.e., joins, on top of parallel retrieval of the underlying partitioned datasets. In this paper, we address these points by: (i) using the polyglot approach of the CloudMdsQL query language, which allows native queries to be expressed as inline scripts and combined with SQL statements for ad-hoc integration, and (ii) incorporating the approach within the LeanXcale distributed query engine, thus allowing native scripts to be processed in parallel at data store shards. In addition, (iii) efficient optimization techniques, such as bind join, can be applied to improve the performance of selective joins. Our experimental validation evaluates the performance benefits of exploiting parallelism in combination with high expressivity and optimization.
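The abstract names bind join as an optimization for selective joins. As a rough illustration of the general technique only (not the paper's implementation, and not CloudMdsQL or LeanXcale code), the sketch below collects the join keys from the already-retrieved outer side and passes them as a filter to the inner data store, so that only matching rows are fetched. The in-memory ORDERS list, the fetch_orders_by_customer function, and all field names are hypothetical stand-ins for a remote store and its query interface.

```python
from typing import Callable, Dict, Iterable, List, Set

Row = Dict[str, object]


def bind_join(
    outer_rows: Iterable[Row],
    fetch_inner: Callable[[Set[object]], List[Row]],
    outer_key: str,
    inner_key: str,
) -> List[Row]:
    """Join outer_rows with an inner source, pushing the outer keys down as a filter."""
    outer_list = list(outer_rows)
    # Step 1: collect the distinct join keys seen on the outer side.
    keys = {row[outer_key] for row in outer_list}
    if not keys:
        return []
    # Step 2: ask the inner store only for rows whose key is in that set
    # (in a real system this would become something like an IN (...) predicate).
    inner_rows = fetch_inner(keys)
    # Step 3: finish the join locally with a hash index on the inner rows.
    index: Dict[object, List[Row]] = {}
    for row in inner_rows:
        index.setdefault(row[inner_key], []).append(row)
    return [
        {**outer, **inner}
        for outer in outer_list
        for inner in index.get(outer[outer_key], [])
    ]


# Hypothetical inner data store, simulated with an in-memory list.
ORDERS: List[Row] = [
    {"order_id": 10, "customer_id": 1, "total": 20.0},
    {"order_id": 11, "customer_id": 2, "total": 5.0},
    {"order_id": 12, "customer_id": 3, "total": 7.5},
    {"order_id": 13, "customer_id": 1, "total": 1.0},
]

rows_fetched: List[int] = []  # records how many rows each pushed-down query returned


def fetch_orders_by_customer(keys: Set[object]) -> List[Row]:
    """Stand-in for a filtered query against the inner store."""
    result = [row for row in ORDERS if row["customer_id"] in keys]
    rows_fetched.append(len(result))
    return result


# Outer side: the result of a selective query on another (hypothetical) store.
customers: List[Row] = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 3, "name": "Lin"},
]

joined = bind_join(customers, fetch_orders_by_customer, "customer_id", "customer_id")
```

In this toy data, the order for customer 2 never leaves the inner store, which is the point of the technique: the more selective the outer side, the less data the inner side has to return.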
Pages: 939 - 977
Page count: 38