An Algebraic Approach for Data-Centric Scientific Workflows

Authors
Ogasawara, Eduardo [1, 2]
Dias, Jonas [1]
de Oliveira, Daniel [1]
Porto, Fabio [3]
Valduriez, Patrick [4]
Mattoso, Marta [1]
Affiliations
[1] Univ Fed Rio de Janeiro, COPPE, Rio de Janeiro, Brazil
[2] CEFET RJ, Rio de Janeiro, Brazil
[3] LNCC, Petropolis, Brazil
[4] INRIA & LIRMM, Montpellier, France
Source
Proceedings of the VLDB Endowment | 2011 / Vol. 4 / No. 12
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Scientific workflows have emerged as a basic abstraction for structuring and executing scientific experiments in computational environments. In many situations, these workflows are computationally and data intensive, thus requiring execution in large-scale parallel computers. However, parallelization of scientific workflows remains low-level, ad-hoc and labor-intensive, which makes it hard to exploit optimization opportunities. To address this problem, we propose an algebraic approach (inspired by relational algebra) and a parallel execution model that enable automatic optimization of scientific workflows. We conducted a thorough validation of our approach using both a real oil exploitation application and synthetic data scenarios. The experiments were run in Chiron, a data-centric scientific workflow engine implemented to support our algebraic approach. Our experiments demonstrate performance improvements of up to 226% compared to an ad-hoc workflow implementation.
Pages: 1328-1339
Number of pages: 12