Parallel query processing in a polystore

Authors
Pavlos Kranas
Boyan Kolev
Oleksandra Levchenko
Esther Pacitti
Patrick Valduriez
Ricardo Jiménez-Peris
Marta Patiño-Martinez
Affiliations
[1] LeanXcale
[2] Distributed Systems Lab at Universidad Politécnica de Madrid
[3] Inria
[4] University of Montpellier
[5] CNRS
[6] LIRMM
Source
Distributed and Parallel Databases | 2021 / Volume 39
Keywords
Database integration; Heterogeneous databases; Distributed and parallel databases; Polystores; Query languages; Query processing;
Abstract
The blooming of different data stores has made polystores a major topic in the cloud and big data landscape. As the amount of data grows rapidly, it becomes critical to exploit the inherent parallel processing capabilities of underlying data stores and data processing platforms. To fully achieve this, a polystore should: (i) preserve the expressivity of each data store’s native query or scripting language and (ii) leverage a distributed architecture to enable parallel data integration, i.e. joins, on top of parallel retrieval of underlying partitioned datasets. In this paper, we address these points by: (i) using the polyglot approach of the CloudMdsQL query language that allows native queries to be expressed as inline scripts and combined with SQL statements for ad-hoc integration and (ii) incorporating the approach within the LeanXcale distributed query engine, thus allowing for native scripts to be processed in parallel at data store shards. In addition, (iii) efficient optimization techniques, such as bind join, can take place to improve the performance of selective joins. We evaluate the performance benefits of exploiting parallelism in combination with high expressivity and optimization through our experimental validation.
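The bind join mentioned in the abstract can be illustrated with a minimal sketch. This is not the CloudMdsQL or LeanXcale implementation: the in-memory lists `CUSTOMERS` and `ORDER_SHARDS`, and the helpers `scan_shard` and `bind_join`, are hypothetical stand-ins for two data stores and their native subqueries. The idea shown is only the general one: collect the join keys from the selective side, push them down as a filter to each shard of the other side, scan the shards in parallel, and join the reduced results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for two data stores (assumed data, not from the paper).
CUSTOMERS = [
    {"cust_id": 1, "country": "ES"},
    {"cust_id": 2, "country": "FR"},
    {"cust_id": 3, "country": "ES"},
]
# The second store is partitioned into shards.
ORDER_SHARDS = [
    [{"order_id": 10, "cust_id": 1}, {"order_id": 11, "cust_id": 2}],
    [{"order_id": 12, "cust_id": 3}, {"order_id": 13, "cust_id": 4}],
]


def scan_shard(shard, key, keys):
    """Simulate a native subquery at one shard, filtered by the bound join keys."""
    return [row for row in shard if row[key] in keys]


def bind_join(left_rows, right_shards, key):
    # Step 1: collect the distinct join keys from the selective left side.
    keys = {row[key] for row in left_rows}
    # Step 2: push the keys down to every shard and scan the shards in parallel.
    with ThreadPoolExecutor() as pool:
        parts = list(pool.map(lambda s: scan_shard(s, key, keys), right_shards))
    # Step 3: hash-join the reduced right side with the left rows.
    by_key = {}
    for row in left_rows:
        by_key.setdefault(row[key], []).append(row)
    return [
        {**left, **right}
        for part in parts
        for right in part
        for left in by_key[right[key]]
    ]


if __name__ == "__main__":
    spanish = [c for c in CUSTOMERS if c["country"] == "ES"]
    for row in bind_join(spanish, ORDER_SHARDS, "cust_id"):
        print(row)
```

In this sketch only the orders whose `cust_id` matches a selected customer leave each shard, which is why the technique pays off when the join is selective.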
Pages: 939-977
Page count: 38