The many faces of data-centric workflow optimization: a survey

Cited by: 24
Authors
Kougka G. [1]
Gounaris A. [1]
Simitsis A. [2]
Affiliations
[1] Aristotle University of Thessaloniki, Thessaloníki
[2] HP Labs, Palo Alto
Keywords
Data analysis; Data flows; Data science; Workflow management systems; Workflow optimization
DOI
10.1007/s41060-018-0107-0
Abstract
Workflow technology is rapidly evolving and, rather than being limited to modeling the control flow in business processes, is becoming a key mechanism to perform advanced data management, such as big data analytics. This survey focuses on data-centric workflows (or workflows for data analytics or data flows), where a key aspect is data passing through and getting manipulated by a sequence of steps. The large volume and variety of data, the complexity of operations performed, and the long time such workflows take to compute give rise to the need for optimization. In general, data-centric workflow optimization is a technology in evolution. This survey focuses on techniques applicable to workflows comprising arbitrary types of data manipulation steps and semantic inter-dependencies between such steps. Further, it serves a twofold purpose: firstly, to present the main dimensions of the relevant optimization problems and the types of optimizations that occur before flow execution and secondly, to provide a concise overview of the existing approaches with a view to highlighting key observations and areas deserving more attention from the community. © 2018, Springer International Publishing AG, part of Springer Nature.
Pages: 81-107
Number of pages: 26