Sampling business process event logs with guarantees

Authors
Su, Xuan [1]
Liu, Cong [1, 2, 4]
Zhang, Shuaipeng [3]
Zeng, Qingtian [2]
Affiliations
[1] Shandong Univ Technol, Sch Comp Sci & Technol, Zibo, Peoples R China
[2] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao, Peoples R China
[3] Shandong Univ, Sch Software, Jinan, Peoples R China
[4] Shandong Univ Technol, Sch Comp Sci & Technol, Zibo 255000, Peoples R China
Source
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2024 / Vol. 36 / Issue 13
Keywords
process mining; model discovery; event log sampling; behavior equivalence; efficiency; PROCESS MODELS; DISCOVERY;
DOI
10.1002/cpe.8077
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Event log sampling has emerged as a key research focus in process mining, aiming to improve the efficiency of process mining tasks such as model discovery, conformance checking, and process prediction. However, current log sampling techniques often fail to ensure high-quality sample logs. This paper introduces a novel framework that supports efficient event log sampling without compromising the quality of the sample log relative to the original one. The approach treats the directly-follows relation (DFR) among business tasks as the fundamental behavior unit of an event log. By ensuring DFR equivalence between the original and sample logs, the proposed technique addresses the challenge of sample log quality from the model discovery point of view. The framework is instantiated by seven distinct sampling strategies, each with its own specialty, and is fully implemented in the open-source process mining platform ProM. To validate its effectiveness, we conducted a comprehensive experimental evaluation on 12 publicly available real-life event logs against state-of-the-art sampling techniques. The results show that our technique significantly improves model discovery efficiency while maintaining high quality of the discovered models.
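To make the idea of DFR equivalence concrete, the following is a minimal Python sketch, not one of the paper's seven strategies and not the ProM implementation. It assumes a log is a list of traces, each trace a list of activity names, and that two logs are DFR-equivalent when they yield the same set of directly-follows pairs. The function names and the greedy variant-selection heuristic are hypothetical, chosen only for illustration.

```python
from collections import Counter


def directly_follows(trace):
    """Return the set of (a, b) pairs where b immediately follows a."""
    return set(zip(trace, trace[1:]))


def log_dfr(log):
    """Return the union of directly-follows pairs over all traces."""
    dfr = set()
    for trace in log:
        dfr |= directly_follows(trace)
    return dfr


def sample_dfr_equivalent(log):
    """Greedily pick trace variants until the sample covers every
    directly-follows pair of the original log (illustrative heuristic)."""
    variants = Counter(tuple(t) for t in log)  # variant -> frequency
    target = log_dfr(variants)
    covered = set()
    sample = []
    remaining = list(variants)
    while covered != target:
        # Prefer the variant adding the most uncovered pairs;
        # break ties by how often the variant occurs in the log.
        best = max(
            remaining,
            key=lambda v: (len(directly_follows(v) - covered), variants[v]),
        )
        sample.append(list(best))
        covered |= directly_follows(best)
        remaining.remove(best)
    return sample


log = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "c"],
    ["a", "b"],
    ["b", "c"],
]
sample = sample_dfr_equivalent(log)
print(len(log), "traces reduced to", len(sample))
print(log_dfr(sample) == log_dfr(log))
```

In this toy log, a discovery algorithm that depends only on the directly-follows pairs would see the same behavior in the sample as in the full log, which is the quality guarantee the abstract describes. Whether start and end activities or pair frequencies should also be preserved is left open here.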
Pages: 18
Related papers
50 in total
  • [1] Sampling business process event logs using graph-based ranking model
    Liu, Cong
    Pei, Yulong
    Cheng, Long
    Zeng, Qingtian
    Duan, Hua
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (05):
  • [2] Discovering Business Process Architectures from Event Logs
    Bano, Dorina
    Nikaj, Adriatik
    Weske, Mathias
    BUSINESS PROCESS MANAGEMENT FORUM (BPM 2021), 2021, 427 : 162 - 177
  • [3] Local Concurrency Detection in Business Process Event Logs
    Armas-Cervantes, Abel
    Dumas, Marlon
    La Rosa, Marcello
    Maaradji, Abderrahmane
    ACM TRANSACTIONS ON INTERNET TECHNOLOGY, 2019, 19 (01)
  • [4] Mining Business Process Stages from Event Logs
    Hoang Nguyen
    Dumas, Marlon
    ter Hofstede, Arthur H. M.
    La Rosa, Marcello
    Maggi, Fabrizio Maria
    ADVANCED INFORMATION SYSTEMS ENGINEERING (CAISE 2017), 2017, 10253 : 577 - 594
  • [5] The impact of biased sampling of event logs on the performance of process discovery
    Mohammadreza Fani Sani
    Sebastiaan J. van Zelst
    Wil M. P. van der Aalst
    Computing, 2021, 103 : 1085 - 1104
  • [6] The impact of biased sampling of event logs on the performance of process discovery
    Fani Sani, Mohammadreza
    van Zelst, Sebastiaan J.
    van der Aalst, Wil M. P.
    COMPUTING, 2021, 103 (06) : 1085 - 1104
  • [7] Discovering Structural Errors From Business Process Event Logs
    Song, Wei
    Chang, Zhen
    Jacobsen, Hans-Arno
    Zhang, Pengcheng
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2022, 34 (11) : 5293 - 5306
  • [8] A Systematic Review of Anomaly Detection for Business Process Event Logs
    Ko, Jonghyeon
    Comuzzi, Marco
    BUSINESS & INFORMATION SYSTEMS ENGINEERING, 2023, 65 (04) : 441 - 462
  • [9] Explanation of Anomalies in Business Process Event Logs with Linguistic Summaries
    Chouhan, Sudhanshu
    Wilbik, Anna
    Dijkman, Remco
    2022 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2022,
  • [10] A Systematic Review of Anomaly Detection for Business Process Event Logs
    Jonghyeon Ko
    Marco Comuzzi
    Business & Information Systems Engineering, 2023, 65 : 441 - 462