Towards a Multi-engine Query Optimizer for Complex SQL Queries on Big Data

Cited by: 0
Authors
Kassela, Evdokia [1]
Konstantinou, Ioannis [1]
Koziris, Nectarios [1]
Affiliations
[1] Natl Tech Univ Athens, CSLab, Athens, Greece
Source
2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA) | 2019
Keywords
big data analytics; SQL; multi-engine; optimizer
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In an era where big data analytics has become a first-class requirement for both the industrial and the academic community, multiple engines are built to execute distributed domain-specific analytics. SQL-based big data analytics is a very popular but also challenging domain due to its complexity that requires multiple runtime query optimizations. Popular frameworks, such as Presto and SparkSQL, commonly retrieve data from multiple sources and process them locally using domain-specific optimizers. However, recent work indicates that no single engine offers the optimal all-in-one solution for all types of SQL queries. Taking this into account, we envision building an optimizer to facilitate faster distributed SQL analytics over multiple engines, which will perform operator-level optimization using Machine Learning techniques and will exploit the sophisticated data-driven local engine optimizations.
Pages: 6095-6097
Number of pages: 3