Sampling business process event logs with guarantees

Authors
Su, Xuan [1]
Liu, Cong [1, 2, 4]
Zhang, Shuaipeng [3]
Zeng, Qingtian [2]
Affiliations
[1] Shandong Univ Technol, Sch Comp Sci & Technol, Zibo, Peoples R China
[2] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao, Peoples R China
[3] Shandong Univ, Sch Software, Jinan, Peoples R China
[4] Shandong Univ Technol, Sch Comp Sci & Technol, Zibo 255000, Peoples R China
Source
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2024 / Vol. 36 / No. 13
Keywords
process mining; model discovery; event log sampling; behavior equivalence; efficiency; PROCESS MODELS; DISCOVERY;
DOI
10.1002/cpe.8077
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Event log sampling has emerged as a key research focus in the field of process mining, aiming to enhance the efficiency of various process mining tasks, including model discovery, conformance checking, and process prediction. However, current log sampling techniques often fail to ensure high-quality sample logs. This paper introduces a novel framework that supports efficient event log sampling without compromising the quality of the sample log relative to the original one. The approach treats the directly-follows relation (DFR) among business tasks as the fundamental behavior unit of an event log. By ensuring DFR equivalence between the original and sample logs, the proposed technique addresses the challenge of sample log quality from the model discovery point of view. The framework is instantiated by seven distinct sampling strategies, each with its own specialty, and is fully implemented in the open-source process mining platform ProM. To validate its effectiveness, we conducted a comprehensive experimental evaluation against state-of-the-art sampling techniques using 12 publicly available real-life event logs. The results demonstrate that our technique significantly improves model discovery efficiency while maintaining the quality of the discovered models.
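The idea of DFR equivalence described in the abstract can be illustrated with a small sketch. This is not one of the paper's seven strategies, whose details the abstract does not give; it is a hypothetical greedy selection, assuming a log is a list of traces and each trace is a sequence of activity labels. The names `dfr_set` and `greedy_dfr_sample` are made up for this illustration.

```python
from typing import Hashable, List, Sequence, Set, Tuple

Trace = Sequence[Hashable]
Pair = Tuple[Hashable, Hashable]


def dfr_set(log: Sequence[Trace]) -> Set[Pair]:
    """Collect every directly-follows pair (a, b): b occurs right after a in some trace."""
    return {(a, b) for trace in log for a, b in zip(trace, trace[1:])}


def greedy_dfr_sample(log: Sequence[Trace]) -> List[Tuple[Hashable, ...]]:
    """Hypothetical strategy: pick trace variants until the sample's DFR set
    equals the original log's DFR set.

    Assumption: only directly-follows pairs are preserved. Start/end
    activities and single-event traces are not tracked in this sketch.
    """
    target = dfr_set(log)
    covered: Set[Pair] = set()
    sample: List[Tuple[Hashable, ...]] = []
    # Distinct trace variants, in order of first appearance.
    variants = list(dict.fromkeys(tuple(t) for t in log))
    while covered != target:
        # Choose the variant that contributes the most uncovered pairs.
        best = max(variants, key=lambda v: len(dfr_set([v]) - covered))
        covered |= dfr_set([best])
        sample.append(best)
    return sample


log = [["a", "b", "c"], ["a", "c", "b"], ["a", "b", "c"], ["a", "b"]]
sample = greedy_dfr_sample(log)
print(len(sample), dfr_set(sample) == dfr_set(log))
```

The loop always terminates because the target set is the union of the variants' pairs, so some variant adds at least one uncovered pair whenever coverage is incomplete. A discovery algorithm that relies only on the directly-follows relation would see the same relation in the sample as in the full log, which is the sense of "equivalence" assumed here.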
Pages: 18