Co-scheduling Ensembles of In Situ Workflows

Cited by: 3
Authors
Do, Tu Mai Anh [1]
Pottier, Loic [1]
da Silva, Rafael Ferreira [2]
Suter, Frederic [2]
Caino-Lores, Silvina [3]
Taufer, Michela [3]
Deelman, Ewa [1]
Affiliations
[1] Univ Southern Calif, Inst Informat Sci, Marina Del Rey, CA 90292 USA
[2] Oak Ridge Natl Lab, Oak Ridge, TN USA
[3] Univ Tennessee, Knoxville, TN USA
Source
2022 IEEE/ACM WORKSHOP ON WORKFLOWS IN SUPPORT OF LARGE-SCALE SCIENCE (WORKS) | 2022
Keywords
workflow ensemble; in situ; co-scheduling; molecular dynamics; high-performance computing
DOI
10.1109/WORKS56498.2022.00011
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Molecular dynamics (MD) simulations are widely used to study large-scale molecular systems. HPC systems are ideal platforms for these studies; however, reaching the simulation timescale needed to detect rare processes is challenging, even on modern supercomputers. To overcome the timescale limitation, the simulation of one long MD trajectory is replaced by multiple short-range simulations that are executed simultaneously as an ensemble. Thanks to in situ techniques, analyses are usually co-scheduled with these simulations to efficiently process the large volumes of data they generate at runtime. Executing a workflow ensemble of simulations and their in situ analyses requires efficient co-scheduling strategies and sophisticated management of computational resources so that the simulations and analyses do not slow each other down. In this paper, we propose an efficient method to co-schedule simulations and in situ analyses such that the makespan of the workflow ensemble is minimized. We present a novel approach to allocate resources for a workflow ensemble under resource constraints by using a theoretical framework that models the workflow ensemble's execution. We evaluate the proposed approach on various workflow ensemble configurations using an accurate simulator based on the WRENCH simulation framework. Results demonstrate the importance of co-scheduling simulations with the in situ analyses that share their data, so as to benefit from data locality: inefficient scheduling decisions can increase the makespan by up to a factor of 30.
Pages: 43-51
Page count: 9