Applying Process Mining on Scientific Workflows: A Case Study on High Performance Computing Data

Cited by: 0
Authors
Sadeghibogar, Zahra [1]
Berti, Alessandro [1]
Pegoraro, Marco [1]
van der Aalst, Wil M. P. [1]
Affiliations
[1] RWTH Aachen University, Process and Data Science, Aachen, Germany
Source
BUSINESS PROCESS MANAGEMENT WORKSHOPS, BPM 2024 | 2025 / Vol. 534
Keywords
High Performance Computing; SLURM; Scientific workflow; Process mining
DOI
10.1007/978-3-031-78666-2_7
Chinese Library Classification (CLC) number
F [Economics]
Discipline classification code
02
Abstract
Computer-based scientific experiments are becoming increasingly data-intensive, necessitating the use of High-Performance Computing (HPC) clusters to handle large scientific workflows. These workflows result in complex data and control flows within the system, making analysis challenging. This paper focuses on the extraction of case IDs from SLURM-based HPC cluster logs, a crucial step for applying mainstream process mining techniques. The core contribution is the development of methods to correlate jobs in the system, whether their interdependencies are explicitly specified or not. We present our log extraction and correlation techniques, supported by experiments that validate our approach, enabling comprehensive documentation of workflows and identification of performance bottlenecks.
Pages: 84-96
Page count: 13