An Algebraic Approach for Data-Centric Scientific Workflows

Authors
Ogasawara, Eduardo [1 ,2 ]
Dias, Jonas [1 ]
de Oliveira, Daniel [1 ]
Porto, Fabio [3 ]
Valduriez, Patrick [4 ]
Mattoso, Marta [1 ]
Affiliations
[1] Univ Fed Rio de Janeiro, COPPE, Rio de Janeiro, Brazil
[2] CEFET RJ, Rio De Janeiro, Brazil
[3] LNCC, Petropolis, Brazil
[4] INRIA & LIRMM, Montpellier, France
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2011 / Vol. 4 / No. 12
Keywords
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Scientific workflows have emerged as a basic abstraction for structuring and executing scientific experiments in computational environments. In many situations, these workflows are computationally and data intensive, thus requiring execution on large-scale parallel computers. However, parallelization of scientific workflows remains low-level, ad-hoc and labor-intensive, which makes it hard to exploit optimization opportunities. To address this problem, we propose an algebraic approach (inspired by relational algebra) and a parallel execution model that enable automatic optimization of scientific workflows. We conducted a thorough validation of our approach using both a real oil exploitation application and synthetic data scenarios. The experiments were run in Chiron, a data-centric scientific workflow engine implemented to support our algebraic approach. Our experiments demonstrate performance improvements of up to 226% compared to an ad-hoc workflow implementation.
Pages: 1328-1339
Page count: 12
Related papers
50 in total
  • [1] Methodological Approach to Data-Centric Cloudification of Scientific Iterative Workflows
    Caino-Lores, Silvina
    Lapin, Andrei
    Kropf, Peter
    Carretero, Jesus
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2016, 2016, 10048 : 469 - 482
  • [2] A framework for collecting provenance in data-centric scientific workflows
    Simmhan, Yogesh L.
    Plale, Beth
    Gannon, Dennis
    ICWS 2006: IEEE INTERNATIONAL CONFERENCE ON WEB SERVICES, PROCEEDINGS, 2006, : 427 - +
  • [3] Orchestrating Data-Centric Workflows
    Barker, Adam
    Weissman, Jon B.
    van Hemert, Jano
    CCGRID 2008: EIGHTH IEEE INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID, VOLS 1 AND 2, PROCEEDINGS, 2008, : 210 - 217
  • [4] Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes
    Hendrik Nolte
    Philipp Wieder
    Data Intelligence, 2022, 4 (02) : 426 - 438
  • [5] Realising Data-Centric Scientific Workflows with Provenance-Capturing on Data Lakes
    Nolte, Hendrik
    Wieder, Philipp
    DATA INTELLIGENCE, 2022, 4 (02) : 426 - 438
  • [6] Data-centric iteration in dynamic workflows
    Dias, Jonas
    Guerra, Gabriel
    Rochinha, Fernando
    Coutinho, Alvaro L. G. A.
    Valduriez, Patrick
    Mattoso, Marta
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2015, 46 : 114 - 126
  • [7] Extending the data model for data-centric metagenomics analysis using scientific workflows in CAMERA
    Altintas I.
    Chen J.
    Sedova M.
    Gupta A.
    Sun S.
    Lin A.W.
    Gujral M.
    Anand M.K.
    Li W.
    Grethe J.S.
    Ellisman M.
    Proceedings - 6th IEEE International Conference on e-Science Workshops, e-ScienceW 2010, 2010, : 49 - 56
  • [8] A Data-Centric Framework for Composable NLP Workflows
    Liu, Zhengzhong
    Ding, Guanxiong
    Bukkittu, Avinash
    Gupta, Mansi
    Gao, Pengzhi
    Ahmed, Atif
    Zhang, Shikun
    Gao, Xin
    Singhavi, Swapnil
    Li, Linwei
    Wei, Wei
    Hu, Zecong
    Shi, Haoran
    Liang, Xiaodan
    Mitamura, Teruko
    Xing, Eric P.
    Hu, Zhiting
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING: SYSTEM DEMONSTRATIONS, 2020, : 197 - 204
  • [9] A Planning-Based Service Composition Approach for Data-Centric Workflows
    Lopez-Enriquez, Carlos-Manuel
    Cuevas-Vicenttin, Victor
    Vargas-Solar, Genoveva
    Collet, Christine
    Zechinelli-Martini, Jose-Luis
    SERVICE-ORIENTED COMPUTING - ICSOC 2014 WORKSHOPS, 2015, 8954 : 129 - 143
  • [10] In-memory staging and data-centric task placement for coupled scientific simulation workflows
    Zhang, Fan
    Jin, Tong
    Sun, Qian
    Romanus, Melissa
    Bui, Hoang
    Klasky, Scott
    Parashar, Manish
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (12):