Modeling and optimizing large-scale data flows

Cited by: 2
Authors
Woehrer, Alexander [1]
Brezany, Peter [1]
Janciak, Ivan [1]
Mehofer, Eduard [1]
Affiliations
[1] Univ Vienna, Fac Comp Sci, A-1090 Vienna, Austria
Source
FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE | 2014 / Vol. 31
Keywords
Data-intensive research; Dataflow; Modeling; Optimization; PETRI NETS; WORKFLOWS; TAVERNA; SYNTAX; DESIGN;
DOI
10.1016/j.future.2013.10.004
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Modern scientific collaborations require large-scale integration of various processes. Higher-level dataflow languages are used on top of parallel and distributed dataflow systems to enable faster development of data-intensive workflow programs, easier optimization, and more maintainable code. In this paper, we present the rationale, design, and application of the advanced support needed for modeling and optimizing data flows for data mining and integration processes. The optimization research and development is based on pre-execution modeling of dataflows and on extending the registry of process activities with advanced annotations. Additionally, the overall process leading from a dynamic model to a static model, which serves as input for the optimization algorithms, is described. This novel approach is implemented within an advanced graphical user interface, called the Process Designer, to support semi-automatic optimization, as well as within a dataflow execution platform, called the Gateway. It can be adapted to any dataflow language implementation. The Process Designer architecture, based on modern (meta-)modeling concepts, naturally supports validated transformations between external textual and internal graphical representations of the targeted dataflow language, and in this way significantly increases the productivity and robustness of the implementation processes. (C) 2013 Elsevier B.V. All rights reserved.
Pages: 12 - 27
Page count: 16
Related papers
50 results in total
  • [1] Modeling and Optimizing Large-Scale Wide-Area Data Transfers
    Kettimuthu, Rajkumar
    Vardoyan, Gayane
    Agrawal, Gagan
    Sadayappan, P.
    2014 14TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID), 2014, : 196 - 205
  • [2] MODELING AND OPTIMIZING LARGE-SCALE CHROMATOGRAPHIC SEPARATIONS
    MCCOY, BJ
    ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 1986, 191 : 109 - INDE
  • [3] Optimizing data robustness in large-scale storage systems
    Gougeaud, Sebastien
    Zertal, Soraya
    Lafoucriere, Jacques-Charles
    Deniel, Philippe
    2017 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2017, : 236 - 243
  • [4] Optimizing data stream processing for large-scale applications
    Cappellari, Paolo
    Roantree, Mark
    Chun, Soon Ae
    SOFTWARE-PRACTICE & EXPERIENCE, 2018, 48 (09): : 1607 - 1641
  • [5] Towards Modeling Large-Scale Data Flows in a Multidatacenter Computing System With Petri Net
    Song, Weijing
    Wang, Lizhe
    Ranjan, Rajiv
    Kolodziej, Joanna
    Chen, Dan
    IEEE SYSTEMS JOURNAL, 2015, 9 (02): : 416 - 426
  • [6] INTERPRETING NEW DATA ON LARGE-SCALE BULK FLOWS
    WATKINS, R
    FELDMAN, H
    ASTROPHYSICAL JOURNAL, 1995, 453 (02): : L73 - L76
  • [7] Topic modeling for large-scale text data
    Li, Xi-ming
    Ouyang, Ji-hong
    Lu, You
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2015, 16 (06) : 457 - 465
  • [8] Topic modeling for large-scale text data
    Xi-ming Li
    Ji-hong Ouyang
    You Lu
    Frontiers of Information Technology & Electronic Engineering, 2015, 16 : 457 - 465
  • [9] Modeling and Optimizing Large-Scale Production-Level Transportation Systems
    Nurminskiy E.A.
    Shamray N.B.
    Journal of Applied and Industrial Mathematics, 2022, 16 (03) : 512 - 523
  • [10] Modeling Dynamic Functional Information Flows on Large-Scale Brain Networks
    Lv, Peili
    Guo, Lei
    Hu, Xintao
    Li, Xiang
    Jin, Changfeng
    Han, Junwei
    Li, Lingjiang
    Liu, Tianming
    MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION - MICCAI 2013, PT II, 2013, 8150 : 698 - 705