Sampling business process event logs with guarantees

Cited by: 0
Authors
Su, Xuan [1]
Liu, Cong [1, 2, 4]
Zhang, Shuaipeng [3]
Zeng, Qingtian [2]
Affiliations
[1] Shandong Univ Technol, Sch Comp Sci & Technol, Zibo, Peoples R China
[2] Shandong Univ Sci & Technol, Coll Comp Sci & Engn, Qingdao, Peoples R China
[3] Shandong Univ, Sch Software, Jinan, Peoples R China
[4] Shandong Univ Technol, Sch Comp Sci & Technol, Zibo 255000, Peoples R China
Keywords
process mining; model discovery; event log sampling; behavior equivalence; efficiency; PROCESS MODELS; DISCOVERY
DOI
10.1002/cpe.8077
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Event log sampling has emerged as a key research focus in process mining. It aims to improve the efficiency of process mining tasks such as model discovery, conformance checking, and process prediction. However, current log sampling techniques often fail to ensure high-quality sample logs. This paper introduces a novel framework that supports efficient event log sampling without compromising the quality of the sample log relative to the original one. The approach treats the directly-follows relation (DFR) among business tasks as the fundamental unit of behavior in an event log. By ensuring DFR equivalence between the original and sample logs, the proposed technique addresses the challenge of sample log quality from the model discovery point of view. The framework is instantiated by seven distinct sampling strategies, each with its own specialty, and is fully implemented in the open-source process mining platform ProM. To validate its effectiveness, we conducted a comprehensive experimental evaluation against state-of-the-art sampling techniques using 12 publicly available real-life event logs. The results show that our technique significantly improves model discovery efficiency while maintaining high quality of the discovered models.
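The idea of DFR equivalence described in the abstract can be illustrated with a minimal sketch. This is not one of the paper's seven strategies and does not use ProM; it is a simple greedy set-cover heuristic, assuming a log is a list of traces and each trace is a list of activity labels. The function names `dfr_set` and `greedy_dfr_sample`, and the artificial `<start>`/`<end>` markers used to retain start and end activities, are hypothetical choices made for illustration.

```python
from typing import Hashable, List, Sequence, Set, Tuple

Trace = Sequence[Hashable]
Pair = Tuple[Hashable, Hashable]

# Assumption: artificial markers so that start/end activities count as behavior.
START, END = "<start>", "<end>"


def dfr_set(log: Sequence[Trace]) -> Set[Pair]:
    """Return every directly-follows pair (a, b) observed in the log."""
    pairs: Set[Pair] = set()
    for trace in log:
        padded = [START, *trace, END]
        pairs.update(zip(padded, padded[1:]))
    return pairs


def greedy_dfr_sample(log: Sequence[Trace]) -> List[Tuple[Hashable, ...]]:
    """Pick trace variants greedily until the sample covers all DFRs of the log."""
    target = dfr_set(log)
    # Deduplicate traces into variants while keeping first-seen order.
    variants = list(dict.fromkeys(tuple(t) for t in log))
    covered: Set[Pair] = set()
    sample: List[Tuple[Hashable, ...]] = []
    while covered != target:
        # Choose the variant that adds the most uncovered pairs.
        best = max(variants, key=lambda v: len(dfr_set([v]) - covered))
        sample.append(best)
        covered |= dfr_set([best])
    return sample


log = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "c", "d"],
    ["a", "b", "d"],
    ["a", "c", "d"],
]
sample = greedy_dfr_sample(log)
print(len(log), "traces reduced to", len(sample))
print(dfr_set(sample) == dfr_set(log))
```

In this toy log, two variants already contain every directly-follows pair, so a discovery algorithm that depends only on the DFR set would see the same behavior in the sample as in the full log. Frequency information is ignored here, which a real strategy may need to preserve.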
Pages: 18